It is well-known that pure quantum states are typically almost maximally entangled, and thus have close to maximally
mixed subsystems. We consider whether this is true for probabilistic theories more generally, and not just for quantum
theory. We derive a formula for the expected purity of a subsystem in any probabilistic theory for which this quantity
is well-defined. It applies to typical entanglement in pure quantum states, coin tossing in classical probability
theory, and randomization in post-quantum theories; a simple generalization yields the typical entanglement in
(anti)symmetric quantum subspaces. The formula is exact and simple, only containing the number of degrees of freedom
and the information capacity of the respective systems. It allows us to generalize statistical physics arguments in a
way which depends only on coarse properties of the underlying theory. The proof of the formula generalizes several
randomization notions to general probabilistic theories. This includes a generalization of purity, contributing to the
recent effort of finding appropriate generalized entropy measures.

I. INTRODUCTION

It is increasingly recognized that entanglement is ubiquitous, as opposed to a rare resource that is difficult to create.
In fact, most unitary time evolutions (in a sense to be made precise later) generate a large amount of entanglement
within a closed quantum system. This turns out to be equivalent to saying that pure quantum states are typically almost
maximally entangled.

This striking observation was already made decades ago, see e.g. [1-3], although it was initially phrased as
'subsystem entropy typically being maximal'; this was before subsystem entropy became the canonical measure of
entanglement for pure states. The observation and its subsequent refinements have helped us understand more about
entanglement and its role in information processing [1-14] as well as statistical mechanics [15-23]. For example,
bearing the above in mind, it is not surprising that the difficulty for an experimenter trying to perform e.g. quantum
computing is not to generate entanglement but to control what is entangled with what, and in particular to avoid
entanglement between the experiment and the environment, as that will increase the entropy of the system.

Here we show that this observation is an instance of a more universal phenomenon which appears in a wide class
of probabilistic theories: systems typically randomize locally if a global transformation is applied. More specifically,
the expected amount of randomization can be expressed by a simple formula, which is universally valid for any
probabilistic theory satisfying a small set of requirements. The formula describes classical coin tossing as well as
typical entanglement in quantum and possible post-quantum theories, and has a particularly simple form which
does not depend on the details of the theory.

We work in the framework of generalized probabilistic theories, also known as the "convex framework". This 
amounts to taking an operational pragmatic point of view that the physical content of a theory is the predictions 
of outcome statistics, conditional on the experimental settings. A wide range of theories can be formulated in this 
framework, including quantum theory and classical probability theory. 

We ask how pure or mixed subsystems tend to be in such theories, if the global state is drawn randomly (possibly
subject to some constraints). To make the question well-defined, we add some additional restrictions on the set
of theories, including crucially that all pure states are connected by reversible dynamics. Our main result is to
give a simple expression for the expected value of the purity of a subsystem in such probabilistic theories. The
expression shows that, in certain limits, subsystems are typically close to maximally random. (In the case of pure
global quantum states this is equivalent to saying that the states are typically close to maximally entangled.)

Our result unifies several instances of randomization associated with different theories. It also clarifies which
features of the theory are behind this phenomenon and govern the strength with which it occurs. Some of the
techniques invented in the proof are, in addition, interesting in themselves. These include generalizations of the
notions of purity and of Pauli operators to general probabilistic theories. The proof is moreover guided by an intuitive
Heisenberg-picture argument which differs from the standard arguments for the quantum case and arguably adds
to our understanding of the quantum result.

We apply the result to generalize a specific statistical mechanical argument employing typical entanglement which
is related to the second law of thermodynamics. We moreover calculate the typical subsystem purity in a variety of
cases, including typical entanglement of pure symmetric and antisymmetric bipartite quantum states, which is to
our knowledge also a new contribution.

The presentation is divided into two parts. It should be possible for readers not wishing to familiarize themselves 
with general probabilistic theories to only read the first part. The first part describes the main results and their 
implications with an emphasis on the quantum and classical cases. In the second part we deal with the general 
probabilistic case. 

II. MAIN RESULTS AND OVERVIEW 

One of our main results is an identity which relates simple properties of state spaces to the randomization of
subsystems. Suppose that Alice and Bob hold a bipartite system AB (for example, a composite quantum system
$A \otimes B$). They draw a bipartite state $\omega^{AB}$ at random; it may be a random pure state, or a random mixed
state with fixed purity $\mathcal{P}(\omega^{AB})$. Then, the reduced state at Alice, $\omega^A$, will in general be
mixed: its expected purity turns out to be

$$\mathbb{E}_\omega \mathcal{P}(\omega^A) = \frac{K_A - 1}{K_A K_B - 1} \cdot \frac{N_A N_B - 1}{N_A - 1} \cdot \mathcal{P}(\omega^{AB}). \qquad (1)$$

The parameters $K_A$ and $N_A$ denote the state space dimension and information carrying capacity of A, respectively
(similarly for B). It will turn out that this simple formula describes the typical amount of entanglement in random pure
quantum states (in particular the fact that most quantum states are almost maximally entangled), and at the same time
classical coin tossing.

Moreover, this identity describes randomization in possible probabilistic theories beyond quantum theory. It
shows that very coarse properties of a theory are sufficient to determine its randomization power - basically, the
ratio between the total number of degrees of freedom K versus the number of perfectly distinguishable states N.
A generalization of this identity gives the expected amount of entanglement in symmetric and antisymmetric
subspaces, a quantum result that seems to be new as well.
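To make the dependence on K and N concrete, the following minimal sketch evaluates the identity in the form $\frac{K_A-1}{K_A K_B-1}\cdot\frac{N_A N_B-1}{N_A-1}\cdot\mathcal{P}(\omega^{AB})$ with exact rational arithmetic. The function name `expected_local_purity` and the chosen example systems are our own illustrative assumptions, not part of the original derivation.

```python
from fractions import Fraction

def expected_local_purity(K_A, K_B, N_A, N_B, global_purity=1):
    """Expected purity of the reduced state on A, given the coarse
    parameters K (state space dimension) and N (capacity) of A and B.
    Hypothetical helper written for illustration only."""
    prefactor = Fraction(K_A - 1, K_A * K_B - 1) * Fraction(N_A * N_B - 1, N_A - 1)
    return prefactor * Fraction(global_purity)

# Quantum theory (K = N^2): a qubit A coupled to a three-level system B.
quantum = expected_local_purity(K_A=4, K_B=9, N_A=2, N_B=3)

# Classical probability theory (K = N): a bit A coupled to a three-state system B.
classical = expected_local_purity(K_A=2, K_B=3, N_A=2, N_B=3)

print(quantum)    # 3/7
print(classical)  # 1
```

For the quantum example the value 3/7 agrees with $(N_A+1)/(N_A N_B+1)$, while the classical value 1 reflects the absence of entanglement.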

In this section, we give a self-contained and elementary statement of our results:

• First, we outline how we define the purity $\mathcal{P}$ in general (Subsection II A), and we explain the notions of
"state space dimension K" and "capacity N" (in Subsection II B).

• Then we demonstrate how our result unifies typical quantum entanglement and coin tossing (Subsection II C)
into a single identity, and we apply a simple generalization of this result to compute the average entanglement
in symmetric and antisymmetric quantum subspaces (Subsection II D).

• In Subsection II E we apply our results to statistical physics. We argue that the results contribute to a
theory-independent understanding of some aspects of thermalization and the second law, which may be applied in
situations like black hole thermodynamics where the underlying probabilistic theory is not fully known.

• Finally, we give a simple proof of the quantum case in Subsection II F which also illustrates the main ideas of
the more general proof in Section III.



The detailed mathematical calculations and results are given in Section III. The main result is Theorem 29, which
contains the exact list of assumptions which must be satisfied for eq. (1) to hold. There is also a more general version
of this result which needs fewer assumptions, but is slightly less intuitive (Theorem 22). An even more general version
concerns random states under constraints (Theorem 34); this one can be used to derive the average entanglement in
(anti)symmetric subspaces in quantum theory.

Section III uses the mathematical framework of general probabilistic theories, as explained for example in [24, 25].
Several results in this section are of independent interest in this framework. In particular, we introduce and analyze a
general-probabilistic notion of purity. Due to its group-theoretic origin, purity satisfies several interesting identities.
It can be seen as an easy-to-compute replacement for entropy, and has several advantages over recently proposed
entropy measures for probabilistic theories (cf. Subsection III C).



The remainder of this section does not assume familiarity with general probabilistic theories. 

A. Purity 

In quantum theory, the standard notion of purity of a density matrix $\rho$ with eigenvalues $\{\lambda_i\}$ is
$\mathrm{Tr}(\rho^2) = \sum_i \lambda_i^2$. This quantity has an operational meaning as the probability that two
successive measurements on two identical copies of $\rho$ give the same outcome, if one measures in the basis where
$\rho$ is diagonal, i.e. in the minimal uncertainty basis. It is therefore sometimes called the collision probability.

In this work, it will turn out to be extremely useful to rescale this quantity slightly. For density matrices $\rho$ on
$\mathbb{C}^n$, we define

$$\mathcal{P}(\rho) := \frac{n}{n-1}\,\mathrm{Tr}(\rho^2) - \frac{1}{n-1}. \qquad (2)$$

For a qubit (n = 2), this quantity has a nice geometrical interpretation in the Bloch ball: it is the squared length of the
Bloch vector which corresponds to $\rho$. For all dimensions n,

$$\mathcal{P}(\rho) = \begin{cases} 1 & \text{if } \rho \text{ is pure,} \\ 0 & \text{if } \rho \text{ is the maximally mixed state.} \end{cases} \qquad (3)$$

The definition above applies to quantum theory, where states are density matrices on a Hilbert space. However,
we can also consider classical probability theory (CPT) instead, where states are simply probability distributions,
$p = (p_1, \ldots, p_n)$. In analogy to the quantum definition, we set

$$\mathcal{P}(p) := \frac{n}{n-1} \sum_{i=1}^{n} p_i^2 - \frac{1}{n-1}. \qquad (4)$$

In CPT, pure states are probability distributions like $p = (1, 0, \ldots, 0)$ (one unity, all others zero), and the
maximally mixed state is $p = (\frac{1}{n}, \ldots, \frac{1}{n})$. Therefore, eq. (3) is still valid.
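The rescaled purity is easy to evaluate from a list of weights, which may be read either as the eigenvalues of a density matrix or as a classical probability distribution. The sketch below is a minimal illustration under that assumption; the name `rescaled_purity` is ours.

```python
from fractions import Fraction

def rescaled_purity(weights):
    """Return n/(n-1) * sum(w_i^2) - 1/(n-1) for a normalized list of
    n >= 2 non-negative weights (eigenvalues or classical probabilities)."""
    n = len(weights)
    if n < 2:
        raise ValueError("need at least two weights")
    w = [Fraction(x) for x in weights]
    if sum(w) != 1 or any(x < 0 for x in w):
        raise ValueError("weights must be non-negative and sum to one")
    collision_probability = sum(x * x for x in w)
    return Fraction(n, n - 1) * collision_probability - Fraction(1, n - 1)

print(rescaled_purity([1, 0, 0]))                            # 1 (pure state)
print(rescaled_purity([Fraction(1, 3)] * 3))                 # 0 (maximally mixed)
print(rescaled_purity([Fraction(3, 4), Fraction(1, 4)]))     # 1/4
```

The last example is a biased coin, or equivalently a qubit whose Bloch vector has length 1/2, so the purity is the squared length 1/4.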

How can these definitions be naturally generalized to other possible probabilistic theories? (Readers who are not
so interested in the framework of general probabilistic theories may now safely proceed to the next subsection.) In
the quantum case, the standard notion of purity can be expressed as

$$\mathrm{Tr}(\rho^2) = \langle \rho, \rho \rangle,$$

where $\langle X, Y \rangle := \mathrm{Tr}(XY)$ is the Hilbert-Schmidt inner product on the real vector space of Hermitian
matrices. This inner product is very special: it is invariant with respect to unitary transformations U, that is,
$\langle \mathcal{U}(X), \mathcal{U}(Y) \rangle = \langle X, Y \rangle$, where we used the abbreviation
$\mathcal{U}(X) := U X U^\dagger$. This suggests the following strategy for defining purity in general probabilistic
theories: Find an inner product $\langle \cdot, \cdot \rangle$ on the state space which is invariant with respect to all
reversible transformations, and define the purity of a state $\omega$ as $\langle \omega, \omega \rangle$.

To make this idea work, we have to be careful, though: even in quantum theory, the invariant inner product
is not unique in the first place. This is due to the fact that the space of Hermitian matrices V decomposes into
$V = \mathrm{span}\{\mu\} \oplus \hat{V}$, where $\mu = I/n$ is the maximally mixed state, and $\hat{V}$ is the subspace
of traceless Hermitian matrices. These two subspaces are both invariant with respect to unitaries. Thus, group
representation theory [26] tells us that there are infinitely many invariant inner products.

We can fix this problem by subtracting away the maximally mixed state $\mu$: if $\rho$ is a density matrix, we define the
corresponding "Bloch vector" $\hat{\rho} := \rho - \mu$. This is an element of the traceless Hermitian matrices
$\hat{V}$, and that subspace cannot be further decomposed into invariant subspaces. Thus, there is a unique inner product
$\langle \cdot, \cdot \rangle$ (up to a constant factor) on $\hat{V}$, and we define
$\mathcal{P}(\rho) := \langle \hat{\rho}, \hat{\rho} \rangle$, rescaling the inner product such that
$\mathcal{P}(\rho) = 1$ for pure states $\rho$. It turns out that $\langle X, Y \rangle = \frac{n}{n-1}\,\mathrm{Tr}(XY)$
for $X, Y \in \hat{V}$, and so this definition agrees with eq. (2).
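The agreement between the Bloch-vector definition and the rescaled collision probability can be checked directly. The sketch below assumes we work in the eigenbasis of the density matrix, so that the state is represented by its list of eigenvalues; this is no loss of generality because both expressions are unitarily invariant. The function names are hypothetical.

```python
from fractions import Fraction

def purity_from_bloch_vector(eigenvalues):
    """n/(n-1) * Tr(rho_hat^2), where rho_hat = rho - I/n is diagonal here."""
    n = len(eigenvalues)
    bloch = [Fraction(x) - Fraction(1, n) for x in eigenvalues]
    return Fraction(n, n - 1) * sum(b * b for b in bloch)

def purity_from_collision_probability(eigenvalues):
    """n/(n-1) * Tr(rho^2) - 1/(n-1), the rescaled purity."""
    n = len(eigenvalues)
    collision = sum(Fraction(x) ** 2 for x in eigenvalues)
    return Fraction(n, n - 1) * collision - Fraction(1, n - 1)

spectrum = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]
print(purity_from_bloch_vector(spectrum))           # 1/12
print(purity_from_collision_probability(spectrum))  # 1/12
```

Both routes give the same number because $\mathrm{Tr}(\hat{\rho}^2) = \mathrm{Tr}(\rho^2) - 1/n$.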

In Section III, we apply exactly the same construction to define purity in general probabilistic theories (cf.
Definition 11), under some assumptions on the probabilistic theory which are necessary to get a useful definition. The
resulting purity notion will, in particular, still satisfy eq. (3).



B. State space dimension K and capacity N



For every state space, we denote by K the number of real parameters required to describe an unnormalized, mixed state,
whereas N denotes the maximal number of (normalized) states that can be perfectly distinguished in a single measurement.
These quantities were to our knowledge first introduced by Wootters and Hardy [27, 28].

As a simple example, consider a single quantum bit (qubit). Arbitrary mixed states of a qubit are described by
density matrices $\rho = \begin{pmatrix} w & y + iz \\ y - iz & x \end{pmatrix}$ with real w, x, y, z, where
normalization $\mathrm{Tr}\,\rho = 1$ demands that w + x = 1. Since we are
interested in unnormalized states, we may drop this condition. As a result, unnormalized states $\rho$ are described by
four real parameters w, x, y, z. (Positivity of the matrix adds additional constraints in the form of inequalities, but
the set of matrices fulfilling these conditions is still four-dimensional.) That is, we have K = 4. On the other hand, if
we want to distinguish two states $\rho$, $\sigma$ perfectly in a single-shot measurement, they must be orthogonal. Since
there are only two mutually orthogonal pure states on $\mathbb{C}^2$, the capacity of the qubit state space is N = 2.

For all state spaces of quantum theory, the capacity N equals the Hilbert space dimension (i.e. the states live on
$\mathbb{C}^N$), and we have the relation $K = N^2$.

In classical probability theory, the state space with N perfectly distinguishable configurations consists of the
probability distributions $p = (p_1, \ldots, p_N)$ with $\sum_i p_i = 1$. Dropping normalization, these are N real
parameters $p_1, \ldots, p_N$ to specify a state. That is, classical state spaces have K = N, in contrast to quantum theory.

For other general probabilistic state spaces, state space dimension K and capacity N can basically be arbitrary
natural numbers; only the relation $K \geq N$ is always true. We give a rigorous mathematical definition of both
quantities in Section III.



C. Unifying typical entanglement and coin tossing 

We will now show that eq. (1) describes both typical entanglement of random pure quantum states and classical
coin tossing at the same time. This will be demonstrated by considering three special cases of eq. (1).



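Random pure quantum states. Suppose we draw a pure state $\omega^{AB}$ on $A \otimes B$ at random, where A and B are
Hilbert spaces of dimensions $N_A$ and $N_B$ respectively. Recalling eq. (2) and (3) and the fact that $K = N^2$ in
quantum theory, our main formula (1) yields

$$\mathbb{E}_\omega \mathcal{P}(\omega^A) = \mathbb{E}_\omega \left[ \frac{N_A}{N_A - 1}\,\mathrm{Tr}\left[(\omega^A)^2\right] - \frac{1}{N_A - 1} \right] = \frac{N_A^2 - 1}{N_A^2 N_B^2 - 1} \cdot \frac{N_A N_B - 1}{N_A - 1} = \frac{N_A + 1}{N_A N_B + 1} \overset{N_B \to \infty}{\approx} \frac{1}{N_B}.$$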



Recall that $\mathcal{P}(\omega^A)$ is one if and only if $\omega^A$ is pure, and it is zero if and only if $\omega^A$ is
the maximally mixed state. Now if the "bath" B becomes large, we see that the expected purity of the local reduced state
on A gets closer and closer to zero, so that $\omega^A$ gets close to maximally mixed. This expresses the fact that random
pure quantum states are typically almost maximally entangled, if the bipartition is taken with respect to a small
subsystem. [59]

By typicality, at this point, we mean something very specific. Suppose we want to generate a random state $\omega^{AB}$
with fixed purity $\mathcal{P}(\omega^{AB}) =: \mathcal{P}_0$ (in this case $\mathcal{P}_0 = 1$, since we are interested
in random pure states). We do this by choosing a fixed state $\varphi^{AB}$ with purity
$\mathcal{P}(\varphi^{AB}) = \mathcal{P}_0$, and then applying a random reversible transformation T to it,
getting $\omega^{AB} := T\varphi^{AB}$. The transformation T is picked according to the invariant measure (Haar measure)
on the group of reversible transformations. In the quantum case, the Haar measure on unitaries is also called the
circular unitary ensemble. See [29] for an explicit recipe for how to pick unitaries numerically in this manner.
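The sampling procedure can be illustrated numerically. The following sketch is our own and is not the recipe of the cited reference: instead of drawing a Haar-random unitary, it uses the standard fact that applying a Haar-random unitary to a fixed pure state is equivalent to normalizing a vector of independent complex Gaussian amplitudes. It then compares the Monte Carlo average of the rescaled local purity with the value $(N_A+1)/(N_A N_B+1)$. All function names, the sample size and the seed are illustrative assumptions.

```python
import random

def random_pure_state(n_a, n_b, rng):
    """Normalized complex Gaussian vector, stored as an n_a x n_b table of amplitudes."""
    amps = [[complex(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) for _ in range(n_b)]
            for _ in range(n_a)]
    norm = sum(abs(z) ** 2 for row in amps for z in row) ** 0.5
    return [[z / norm for z in row] for row in amps]

def local_purity(amps):
    """Rescaled purity n/(n-1) Tr(rho_A^2) - 1/(n-1) of the reduced state on A."""
    n_a = len(amps)
    # Reduced density matrix: rho[a][c] = sum_b psi[a][b] * conj(psi[c][b]).
    rho = [[sum(x * y.conjugate() for x, y in zip(amps[a], amps[c]))
            for c in range(n_a)] for a in range(n_a)]
    # rho is Hermitian, so Tr(rho^2) is the sum of |rho[a][c]|^2.
    tr_rho_sq = sum(abs(rho[a][c]) ** 2 for a in range(n_a) for c in range(n_a))
    return n_a / (n_a - 1) * tr_rho_sq - 1 / (n_a - 1)

def average_local_purity(n_a, n_b, samples, seed=0):
    rng = random.Random(seed)
    total = sum(local_purity(random_pure_state(n_a, n_b, rng)) for _ in range(samples))
    return total / samples

n_a, n_b = 2, 3
estimate = average_local_purity(n_a, n_b, samples=20000)
predicted = (n_a + 1) / (n_a * n_b + 1)
print(f"Monte Carlo: {estimate:.3f}, formula: {predicted:.3f}")
```

Increasing `n_b` in this sketch drives both numbers towards zero, in line with the statement that the reduced state becomes close to maximally mixed for a large bath.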

So far, our formula only expresses the expectation value of the local purity $\mathcal{P}(\omega^A)$. To call this value
the typical value, one needs to show that the distribution is peaked around the mean. Intuitively this must be the case if
the expected value is close to the minimum allowed, as that could only occur if almost all of the distribution is
concentrated close to the minimum. A simple way to see that this is indeed the case is to apply Markov's inequality [30],
which in this case reads (for x > 1)

$$\mathrm{Prob}\left[ \mathcal{P}(\omega^A) \geq x \cdot \frac{N_A + 1}{N_A N_B + 1} \right] \leq \frac{1}{x}.$$

This shows that if the mean is small, the probability of $\mathcal{P}(\omega^A)$ deviating from it must be small. Stronger
results of this kind can be obtained from measure concentration theorems on Lie groups [31], but we will not pursue this
approach further in this paper.

Random pure classical states. What if we draw random pure bipartite states in classical probability theory? In
this case, purity is defined by eq. (4), and, as discussed above, state space dimension and capacity are equal: K = N.
Thus, our main formula (1) yields for the local marginal $\omega^A = (\omega^A_1, \ldots, \omega^A_{N_A})$ the result

$$\mathbb{E}_\omega \mathcal{P}(\omega^A) = \mathbb{E}_\omega \left[ \frac{N_A}{N_A - 1} \sum_{i=1}^{N_A} (\omega^A_i)^2 - \frac{1}{N_A - 1} \right] = \frac{K_A - 1}{K_A K_B - 1} \cdot \frac{N_A N_B - 1}{N_A - 1} = 1.$$

All the terms cancel, and we get that the expected local reduced purity equals unity. Since 1 is the maximal value,
this is only possible if in fact $\mathcal{P}(\omega^A) = 1$ for all pure states $\omega^{AB}$. In other words: all pure
bipartite states have pure marginals. This expresses the simple fact that there are no entangled states in classical
probability theory - all pure bipartite states are product states.

Before turning to the more interesting example of classical coin tossing, we briefly discuss how classical probability
distributions $\omega^{AB}$ of fixed purity $\mathcal{P}(\omega^{AB}) =: \mathcal{P}_0$ are drawn at random (in this
example, so far, we have the case $\mathcal{P}_0 = 1$). In analogy to the quantum case, we start with an arbitrary fixed
bipartite probability distribution $\varphi^{AB}$ with purity $\mathcal{P}(\varphi^{AB}) = \mathcal{P}_0$. In classical
probability theory, the reversible transformations are the permutations, that is, doubly stochastic matrices containing
only ones and zeroes. Now the state $\omega^{AB}$ is defined as $\omega^{AB} := T\varphi^{AB}$, where T is a random
permutation.

Classical coin tossing. We can use identity (1) to describe the process of coin tossing in classical probability theory.
Suppose we start with a coin (that is, a classical bit) whose value ("heads" or "tails" - say, heads) is perfectly known
to us. In this case, the (pure) state of the coin is a probability distribution $\varphi^A = (1, 0)$, where 1 is the
probability of heads and 0 the probability of tails. However, the environment B is not known to us - it is in some mixed
state $\varphi^B$. The total state (coin and environment) is thus in a mixed state
$\varphi^{AB} := \varphi^A \otimes \varphi^B$, with some purity $\mathcal{P}(\varphi^{AB}) =: \mathcal{P}_0 < 1$.

Now we toss the coin - that is, we flip it in an uncontrolled way which makes it interact with the environment.
It makes sense to model this process as a random reversible transformation (permutation) T of the global system
AB. In the end, we capture the coin, cover it with our hand (so that we cannot see what side is up) and disregard
the environment. The state of the coin is then $\omega^A$, the marginal corresponding to the global state
$\omega^{AB} := T\varphi^{AB}$. We expect that the coin's state should be mixed. In fact,

$$\mathbb{E}_\omega \mathcal{P}(\omega^A) = \frac{K_A - 1}{K_A K_B - 1} \cdot \frac{N_A N_B - 1}{N_A - 1} \cdot \mathcal{P}(\varphi^{AB}) = \mathcal{P}_0 < 1,$$

where the same cancellation as in the previous example applies. That is, our ignorance about the environment gets
transferred to the coin, which is exactly what coin tossing is all about.
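For a very small environment, the coin-tossing statement can be verified by brute force. The sketch below enumerates all permutations of a joint distribution of a coin and a two-state environment and averages the purity of the coin's marginal exactly. The environment distribution (2/3, 1/3) and all function names are illustrative assumptions.

```python
from fractions import Fraction
from itertools import permutations

def purity(p):
    """Classical rescaled purity n/(n-1) * sum(p_i^2) - 1/(n-1)."""
    n = len(p)
    return Fraction(n, n - 1) * sum(x * x for x in p) - Fraction(1, n - 1)

def coin_marginal(joint, n_a, n_b):
    """Marginal on A of a joint distribution indexed as a * n_b + b."""
    return [sum(joint[a * n_b + b] for b in range(n_b)) for a in range(n_a)]

def expected_coin_purity(joint, n_a, n_b):
    """Average marginal purity over all permutations of the joint configurations."""
    total, count = Fraction(0), 0
    for shuffled in permutations(joint):
        total += purity(coin_marginal(shuffled, n_a, n_b))
        count += 1
    return total / count

coin = [Fraction(1), Fraction(0)]            # heads with certainty
environment = [Fraction(2, 3), Fraction(1, 3)]  # partially unknown environment
joint = [c * e for c in coin for e in environment]

print(purity(joint))                      # 11/27, global purity before the toss
print(expected_coin_purity(joint, 2, 2))  # 11/27, expected purity of the tossed coin
```

The two numbers coincide exactly, so on average the coin inherits the purity of the global state.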

D. Typical entanglement of symmetric and antisymmetric states 

So far, our discussion only covered the case that a state is drawn randomly from the set of all states (subject to 
fixed purity). However, there are situations - particularly in thermodynamics, as we discuss in the next subsection - 
where one would like to draw random states subject to additional constraints. 



This generalization is treated in Subsection III G, where we compute the expected subsystem purity for random
states that satisfy certain symmetry constraints (Theorem 34). In the special case of quantum theory, this gives the
typical entanglement for subspaces $S \subset AB$ that have the following symmetry property: For every unitary U on A,
there is a unitary U' on B such that $U \otimes U'$ preserves the subspace S. An explicit formula for the expected purity
of the reduced state is given in Theorem 36.



Here, we apply this result to compute the typical amount of entanglement in pure states of symmetric and
antisymmetric subspaces: they are both $U \otimes U$-invariant. As usual, for a Hilbert space $\mathcal{H}$, the symmetric
subspace $\mathcal{H} \vee \mathcal{H}$ resp. antisymmetric subspace $\mathcal{H} \wedge \mathcal{H}$ are defined as those
vectors $|\psi\rangle$ with $\pi|\psi\rangle = |\psi\rangle$ resp. $\pi|\psi\rangle = -|\psi\rangle$, where
$\pi$ is the unitary that swaps the two particles. For three and more particles, the totally (anti)symmetric subspace is
defined as the set of vectors that satisfy this equation for all pairs of particles simultaneously. Investigating this
case is motivated by the importance of identical bosons and fermions which, by the symmetrisation postulate, have
symmetric and antisymmetric joint states respectively [32, 33].

States such as antisymmetric fermionic states are clearly entangled in the mathematical sense, but we note that
they can only be termed entangled in the operational sense under some additional assumptions. Standard
entanglement theory implicitly assumes that different systems corresponding to different tensor factors can be
operationally distinguished, which is in general not true for bosons and fermions. However, whilst e.g. two electrons are
always indistinguishable, they can in fact be treated as distinguishable if they are localized in two separate spatial
locations [32, 33]. This fact gives rise to a natural scenario where antisymmetric states appear that are entangled in the
operational sense: If two such localized electrons had previously shared the same spatial part of the wavefunction,
their internal degrees of freedom (spin) would have been antisymmetric, and remain so unless altered. After
separating the two electrons, one would have obtained standard (not just mathematical) entanglement between them,
having arisen due to the antisymmetry requirement on their joint state (see e.g. [34]-[36]). One may, for concreteness,
think about our calculations in this section with such a scenario in mind.

Theorem 1. Consider the symmetric and antisymmetric subspaces $S_\pm$ on two n-level quantum systems
$A = B = \mathbb{C}^n$, i.e. $S_+ = \mathbb{C}^n \vee \mathbb{C}^n$ and $S_- = \mathbb{C}^n \wedge \mathbb{C}^n$. If
$\omega_\pm \in S_\pm$ is a random pure quantum state, then the expected local purity is

$$\mathbb{E}_{\omega_\pm} \mathrm{Tr}\left[ (\omega_\pm^A)^2 \right] = \frac{2(n \pm 1)}{n^2 \pm n + 2}.$$

Moreover, drawing a random mixed state of fixed purity $\mathrm{Tr}(\omega_\pm^2)$ from the corresponding subspace, the
expected local purity is described by the same equation, only the factor 2 in the numerator has to be replaced by
$1 + \mathrm{Tr}(\omega_\pm^2)$.
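The pure-state part of this statement can be probed with a small Monte Carlo experiment. The sketch below is an illustration under our own assumptions: random states in the (anti)symmetric subspace are generated by (anti)symmetrizing a complex Gaussian amplitude table and normalizing, which gives the uniform distribution on the unit sphere of the subspace. Function names, sample sizes and seeds are hypothetical choices.

```python
import random

def random_state_in_subspace(n, sign, rng):
    """Random pure state in the symmetric (sign=+1) or antisymmetric (sign=-1)
    subspace of C^n (x) C^n, stored as an n x n table of amplitudes."""
    gauss = [[complex(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) for _ in range(n)]
             for _ in range(n)]
    psi = [[gauss[i][j] + sign * gauss[j][i] for j in range(n)] for i in range(n)]
    norm = sum(abs(z) ** 2 for row in psi for z in row) ** 0.5
    return [[z / norm for z in row] for row in psi]

def local_trace_purity(psi):
    """Tr(rho_A^2) for the reduced state rho_A = psi psi^dagger."""
    n = len(psi)
    rho = [[sum(psi[a][b] * psi[c][b].conjugate() for b in range(n))
            for c in range(n)] for a in range(n)]
    return sum(abs(rho[a][c]) ** 2 for a in range(n) for c in range(n))

def average_purity(n, sign, samples, seed=0):
    rng = random.Random(seed)
    return sum(local_trace_purity(random_state_in_subspace(n, sign, rng))
               for _ in range(samples)) / samples

def theorem_value(n, sign):
    """Right-hand side 2(n +/- 1) / (n^2 +/- n + 2) of the theorem."""
    return 2 * (n + sign) / (n * n + sign * n + 2)

symmetric_estimate = average_purity(3, +1, samples=20000)
antisymmetric_estimate = average_purity(3, -1, samples=2000)
print(f"symmetric:     {symmetric_estimate:.3f} vs {theorem_value(3, +1):.3f}")
print(f"antisymmetric: {antisymmetric_estimate:.3f} vs {theorem_value(3, -1):.3f}")
```

For n = 3 the antisymmetric case is special: every state in the subspace has the same local purity, so the estimate does not fluctuate.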



Proof. We use Theorem 36 from Subsection III G; this theorem is applicable because the symmetric and antisymmetric
subspace are both invariant with respect to transformations of the form $U \otimes U$. In the following, we sketch the
proof for the symmetric subspace $S := S_+$; the proof for the antisymmetric case is completely analogous.



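According to the notation of Theorem 36, we have $N_A = n$ and $N_S = \dim(S_+) = n(n+1)/2$. Since the case n = 1
is trivial, we may assume that $n \geq 2$. Denote orthonormal basis vectors of $A = \mathbb{C}^n$ by
$|1\rangle, |2\rangle, \ldots, |n\rangle$. We may choose the matrix $E_A$ as
$E_A := \frac{1}{\sqrt{2}} |1\rangle\langle 1| - \frac{1}{\sqrt{2}} |2\rangle\langle 2|$; it satisfies
$\mathrm{Tr}\,E_A = 0$ and $\mathrm{Tr}\,E_A^2 = 1$ as required. An orthonormal basis of S consists of the vectors
$|ii\rangle$ with $1 \leq i \leq n$, and $\frac{1}{\sqrt{2}}(|ij\rangle + |ji\rangle)$ for $i < j$. This allows us to
write the projector $\pi$ onto S as

$$\pi = \sum_{i=1}^{n} |ii\rangle\langle ii| + \frac{1}{2} \sum_{i<j} (|ij\rangle + |ji\rangle)(\langle ij| + \langle ji|) = \frac{1}{2} \sum_{i,j} |ij\rangle\langle ij| + \frac{1}{2} \sum_{i,j} |ij\rangle\langle ji|.$$

Using these expressions, the calculation of $\mathrm{Tr}\left[ (\pi (E_A \otimes I_B) \pi)^2 \right]$ is lengthy but
straightforward. The result is that this expression equals n/4 + 1/2. Substituting this into Theorem 36 proves the
claim. □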
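The value n/4 + 1/2 of the trace can be confirmed numerically for small n. The following sketch builds the projector as one half of identity plus swap, forms $E_A \otimes I_B$ explicitly, and evaluates the trace with plain nested lists. It is only a sanity check under the stated construction; the function names are our own.

```python
def matmul(x, y):
    """Product of two square matrices given as nested lists."""
    size = len(x)
    return [[sum(x[r][k] * y[k][c] for k in range(size)) for c in range(size)]
            for r in range(size)]

def symmetric_projector_trace(n):
    """Tr[(pi (E_A (x) I_B) pi)^2] for the symmetric subspace of C^n (x) C^n."""
    dim = n * n
    index = lambda i, j: i * n + j  # basis vector |ij> with A-index i, B-index j

    # Projector pi = (identity + swap) / 2.
    pi = [[0.0] * dim for _ in range(dim)]
    for i in range(n):
        for j in range(n):
            pi[index(i, j)][index(i, j)] += 0.5
            pi[index(i, j)][index(j, i)] += 0.5

    # E_A = (|1><1| - |2><2|) / sqrt(2), extended by the identity on B.
    e_diag = [2 ** -0.5, -(2 ** -0.5)] + [0.0] * (n - 2)
    e_full = [[e_diag[r // n] if r == c else 0.0 for c in range(dim)]
              for r in range(dim)]

    compressed = matmul(matmul(pi, e_full), pi)
    squared = matmul(compressed, compressed)
    return sum(squared[k][k] for k in range(dim))

for n in (2, 3, 4):
    print(n, symmetric_projector_trace(n), n / 4 + 0.5)
```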



Since Theorem 36 (which has been used to prove this result) is applicable in more general situations, there exist
several possibilities to generalize the theorem above. For example, consider the totally symmetric or totally
antisymmetric subspace on N qudits, $S_+ := \mathbb{C}^n \vee \mathbb{C}^n \vee \ldots \vee \mathbb{C}^n$, and
$S_- := \mathbb{C}^n \wedge \mathbb{C}^n \wedge \ldots \wedge \mathbb{C}^n$, both as subspaces of
$AB = (\mathbb{C}^n)^{\otimes N}$. Consider the 1-versus-(N - 1)-qudits cut, i.e. $A = \mathbb{C}^n$ and
$B = (\mathbb{C}^n)^{\otimes (N-1)}$. Then the situation satisfies the conditions of Theorem 36: for every unitary U on A,
there is a unitary U' on B such that $(U \otimes U') S_\pm = S_\pm$, namely $U' := U^{\otimes (N-1)}$. Thus, Theorem 36 can
be used to compute the expected local purity for this cut. (We do not pursue this calculation here.)

It is clear that the result of Theorem 1 above can be proven in principle without the machinery of this paper, purely
within quantum mechanics. However, we think it is important to have it derived within the framework of general
probabilistic theories, showing the power and flexibility of this framework. Our general proof, as given in Section III,
is very geometrical in flavour; it treats the set of quantum states as a convex set, with the (anti)symmetric subspace
as a face. Thereby, it shows very clearly what geometric properties of the quantum state space are important for the
result to hold.

Apart from the generalization to other theories that one obtains for free, this proof method also clarifies some
aspects of the quantum result. For example, it shows why some further generalizations of the above result will need
considerable further effort, such as computing the average local purity for the 2-versus-(N - 2)-cut on the totally
symmetric subspace. If Alice holds two qudits, she can locally perform unitaries of the form $U \otimes U$. If, for any
such unitary, the map $U' := U^{\otimes (N-2)}$ is applied on Bob's part of the state, then the totally symmetric subspace
stays invariant. Since the group of unitaries $U \otimes U$ acts irreducibly on Alice's subspace
$\mathbb{C}^n \vee \mathbb{C}^n$, the situation seems fine at first, and one might guess that Theorem 36 is easily
generalized to this situation.

But this turns out to be wrong: the important property is that Alice's unitaries should act irreducibly on her convex
set of states, which is in general not the case. Instead, the group action
$\rho \mapsto (U \otimes U) \rho (U^\dagger \otimes U^\dagger)$ is reducible on the space of traceless Hermitian matrices
over $\mathbb{C}^n \vee \mathbb{C}^n$, and Alice's (Bloch) state space decomposes into invariant subspaces.
This shows that the relevant question is not whether the group of $U \otimes U$ acts irreducibly, but whether it is a
2-design. In this case, the answer is negative.

E. Statistical physics and the second law 

Our result can also be used to generalize an approach to thermodynamics which has recently attracted a lot of
attention [15-23]. This approach is based on the fact that most pure quantum states are almost maximally entangled,
in the sense described earlier in this paper.

The main idea, as developed for example in [15], can be stated as follows. We divide the universe's Hilbert space
$\mathcal{H}$ into a small "system" and a large "environment", $\mathcal{H} = \mathcal{H}_S \otimes \mathcal{H}_E$. In
many cases, the state of the universe is constrained to be an element of some subspace
$\mathcal{H}_R \subseteq \mathcal{H}$, which might be, for example, a subspace corresponding to
a narrow window of energies. The maximally mixed state on $\mathcal{H}_R$ is called the "equiprobable state"
$\mathcal{E}_R$. The actual state of the universe is then assumed to be some unknown pure state $|\psi\rangle$ from
$\mathcal{H}_R$.

At first, it seems as if the exact form of the actual state $|\psi\rangle \in \mathcal{H}_R$ would have profound
consequences, and that very little can be said about the reduced state on the small subsystem,
$\psi_S = \mathrm{Tr}_E |\psi\rangle\langle\psi|$. But this turns out to be wrong: in fact, "most" states $|\psi\rangle$
look very alike on the small subsystem. That is, $\psi_S \approx \mathrm{Tr}_E\, \mathcal{E}_R$ with high probability
for randomly chosen $|\psi\rangle$. This can be formulated as follows:

Principle of Apparently Equal a priori Probability [15]: For almost every pure state of the
universe, the state of a sufficiently small subsystem is approximately the same as if the universe were in
the equiprobable state $\mathcal{E}_R$. In other words, almost every pure state of the universe is locally (i.e. on the
system) indistinguishable from $\mathcal{E}_R$.

This principle is then used to justify the 'equal a priori probability' assumption, an assumption of statistical physics
which is used in the derivation of many major results in that field.

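Our results can be interpreted in a similar manner. First, consider the simple case where we have a small system,
A (not necessarily quantum), coupled to a large bath, B, and where all global states are in principle possible. In the
quantum situation, this corresponds to the special case where $\mathcal{H}_R = \mathcal{H}$. If we have a random state
$\omega^{AB}$ on AB with purity $\mathcal{P}(\omega^{AB})$, and if the conditions of Theorem 29 are satisfied, we have for
$N_B \gg K_A$

$$\mathbb{E}_\omega \mathcal{P}(\omega^A) = \frac{K_A - 1}{K_A K_B - 1} \cdot \frac{N_A N_B - 1}{N_A - 1}\, \mathcal{P}(\omega^{AB}) \approx \frac{N_B}{K_B} \cdot \frac{N_A (K_A - 1)}{K_A (N_A - 1)}\, \mathcal{P}(\omega^{AB}).$$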

If this is very small, then the state of the small subsystem is very close to maximally mixed. As discussed in
Subsection II C, the Markov inequality (or more powerful measure concentration inequalities) tells us that the
expectation value is then also the typical value. In this case, the "Principle of Apparently Equal a priori Probability"
is satisfied in our more general setting.

It remains to see under what conditions this expectation value is actually close to zero. There are two ways
in which this may happen:

• We might have a random pure state, i.e. $\mathcal{P}(\omega^{AB}) = 1$, but $N_B / K_B$ might tend to zero with
increasing size of the bath B. This is exactly what happens in quantum theory, where $N_B$ is the bath's Hilbert space
dimension, and $K_B = N_B^2$. The interpretation is that "most" pure bipartite states are almost maximally entangled,
such that the local reduced state looks close to maximally mixed.

It is interesting to see that the same phenomenon may appear in general probabilistic theories beyond quantum
theory, and may in fact be stronger: there are natural possible classes of theories [27, 28] where $K_B = N_B^r$ for
some integer $r \in \mathbb{N}$. While we have r = 2 for quantum theory, other theories with $r \geq 3$ would have even
"stronger-than-quantum randomization": they would have $N_B / K_B = N_B^{1-r}$, tending to zero faster than the
quantum value.

• On the other hand, we may have $K_B \approx N_B$, or equality as in the case of classical probability theory. Then, we
could still have randomization if $\mathcal{P}(\omega^{AB})$ tended to zero with increasing size of B, in situations where
it makes sense to model the global state as a random mixed state.

A situation like this is given in classical coin tossing, as discussed in Subsection II C: there,
$\mathcal{P}(\omega^{AB})$ describes the purity of the unknown global initial state, before the coin is tossed in a
random, reversible way. A larger environment usually amounts to less knowledge about its details, which means smaller
purity $\mathcal{P}(\omega^{AB})$.

One may argue that a situation like this is also encountered in natural systems of classical statistical mechanics,
if a small finite system is reversibly coupled to a large, unknown environment.
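The two mechanisms can be compared numerically. The sketch below evaluates the expected local purity $\frac{K_A-1}{K_A K_B-1}\cdot\frac{N_A N_B-1}{N_A-1}\cdot\mathcal{P}(\omega^{AB})$ for a random pure global state under the assumption that both subsystems obey $K = N^r$; applying the power law to A as well as to B is our simplifying assumption, and the function name is hypothetical.

```python
from fractions import Fraction

def expected_local_purity_power_law(n_a, n_b, r, global_purity=1):
    """Expected local purity when K = N^r holds for both A and B
    (r = 1 classical, r = 2 quantum, r >= 3 hypothetical post-quantum)."""
    k_a, k_b = n_a ** r, n_b ** r
    prefactor = Fraction(k_a - 1, k_a * k_b - 1) * Fraction(n_a * n_b - 1, n_a - 1)
    return prefactor * Fraction(global_purity)

for r in (1, 2, 3):
    values = [expected_local_purity_power_law(2, n_b, r) for n_b in (2, 8, 32)]
    print(r, [f"{float(v):.4f}" for v in values])
```

For r = 1 the expected purity stays at 1 for every bath size, so randomization can only come from a mixed global state, whereas for r = 2 and r = 3 it decays with the bath size, faster for larger r.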

In the quantum situation, the principle above is formulated for the more general case that Hris a proper subspace 
of H, and not all of the global Hilbert space. The analogue of this situation in general probabilistic theories would 
be to have the set of allowed states restricted to some face of the global state space AB. Our results do not directly 
address this situation in full generality, but Theorem [34] covers the special case of a GG' -invariant face F. Even 
though the resulting formula is not as transparent as the one above, it shows that the amount of randomization is
also very strong if the face's dimension $K_F$ increases with the bath $B$: since projections are contractions, we have

$$\mathbb{E}_\omega\, \mathcal{P}(\omega^A) \leq \frac{K_A - 1}{K_F - 1} \cdot \frac{\mathcal{P}(\omega^{AB})}{\mathcal{P}(\varphi^A \otimes \mu^B)} = \frac{K_A - 1}{K_F - 1} \cdot \frac{N_A N_B - 1}{N_A - 1}\, \mathcal{P}(\omega^{AB})$$

whenever the conditions of Theorem 29 are satisfied (we have also used Lemma 21). If $N_B/K_F$ tends to zero with
increasing size of the bath (as is the case for symmetric and antisymmetric subspaces in quantum theory), the Principle
of Apparently Equal a priori Probability remains valid.

A speculative, but interesting application of this result in the post-quantum case could be in black hole thermody- 
namics. The results on typical entanglement have already been discussed in the context of black hole entropy f3^, 
and quantum information analysis has been applied to learn more about the black hole information paradox [15, 38-40].
Since no fully complete and unique theory of quantum gravity is available yet, many parts of black hole
thermodynamics are subject to speculation. Vice versa, the assumption that the laws of thermodynamics are valid for
black holes is used to obtain information on properties of the possible underlying theory of quantum gravity.

One may speculate that a possible theory of quantum gravity might not only involve a modification of the usual
concepts of geometry and gravity, but also of quantum theory itself. It is possible that quantum theory is only an
approximation to a different kind of deeper probabilistic theory, much as classical probability is only an
approximation to quantum theory. The principle of equal a priori probability is closely linked to the second law, and one
may view our results as a first step towards formulating the second law as a kind of meta-theorem that does not
depend on the details of the theory and may thus apply to post-quantum theories.

As further motivation for research in this direction we note that there is a striking historical precedent where
assuming the persistence of the second law helped to discover new physics. Planck [41] arrived at energy quantization
(energy $\epsilon = h\nu$, where $\nu$ is a frequency and $h$ his constant), by implicitly assuming certain thermodynamical entropic
relations would still be valid after the quantization of energy.

F. Simple proof of the quantum case 

We now give a comparatively simple derivation of the value of typical purity for the quantum case. The proof
simplifies and generalizes the proof for the case of globally pure quantum states in [13, 42]. Moreover, it gives some
intuition on the necessary notions and ingredients for the general proof in Section III.

Firstly we note that the local purity is directly related to how well one can predict measurements of local outcomes.
Phrased in these terms we wish to show that local measurements (of the form $g_A \otimes \mathbb{1}_B$ for some $g_A \neq \mathbb{1}_A$) tend to
be highly unpredictable. We shall accordingly represent the state in a way which makes it clear to what extent local
measurements are defined. We will use a nice way of linking the Heisenberg and Schrödinger pictures, which is to
expand the density matrix in terms of elements $g_i$ of the Pauli group $\{X, Y, Z, \mathbb{1}\}^{\otimes n}$:

$$\rho = \sum_i \xi_i\, g_i.$$

The sum contains $4^n$ terms, which we label from $i = 0$ to $i = 4^n - 1$, such that $g_0 = \mathbb{1}$. The coefficients $\xi_i$ are directly
related to the expectation values of the corresponding Pauli element via

$$\langle g_i \rangle = \mathrm{Tr}(\rho\, g_i) = \mathrm{Tr}(\xi_i \mathbb{1}) = 2^n \xi_i. \qquad (5)$$

In what follows we use the above representation to derive the expected purity value. We write the formula in a 
way that highlights that the ratio of the local to the total purity is proportional to the ratio of the number of local 
versus global observables. The intuition here is as follows. There is a certain limited amount of purity/predictability 
about the state, and this gets associated with observables picked at random (not independently) by the random 
unitary. If most observables are global, this predictability is then likely to be associated with global observables, 
while the remaining local ones become unpredictable. 

Lemma 2. Consider any quantum state $\varphi$ on $n = n_A + n_B$ qubits, with fixed purity $\mathrm{Tr}(\varphi^2)$. Apply a random unitary $U$ to it,
i.e. $\rho := U \varphi U^\dagger$. Then the expected local purity on subsystem $A$ is given by

$$\frac{\mathbb{E}_U \mathrm{Tr}[(\rho^A)^2] - 2^{-n_A}}{\mathrm{Tr}(\varphi^2) - 2^{-n}} = 2^{n_B}\, \frac{K_A - 1}{K_{AB} - 1},$$

where $K_A = 4^{n_A}$ and $K_{AB} = 4^n$ quantify the number of local (i.e. Paulis of the form $g_A \otimes \mathbb{1}_B$) and global degrees of freedom
(all other Paulis), respectively. Note that $2^{-n_A}$ and $2^{-n}$ are the minimal possible values of the purity of any quantum state on
$A$ resp. $AB$.

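Proof. Note that $\rho^A = \mathrm{Tr}_B(\rho) = \sum_{g_i \text{ with } \mathbb{1} \text{ on } B} \xi_i\, \mathrm{Tr}_B(g_i)$, and there are $4^{n_A}$ such elements. Since
$\mathrm{Tr}_B(g_A \otimes \mathbb{1}_B) = 2^{n_B} g_A$ and products of distinct Paulis are traceless, this shows that
$\mathrm{Tr}[(\rho^A)^2] = (2^{n_B})^2\, 2^{n_A} \sum_{i=0}^{4^{n_A}-1} \xi_i^2$, where $i$ is the label of the Pauli operator $g_A \otimes \mathbb{1}_B$. Consequently

$$\mathbb{E}_U \mathrm{Tr}[(\rho^A)^2] = 2^{n+n_B} \sum_{i=0}^{4^{n_A}-1} \mathbb{E}_U \xi_i^2 = 2^{n+n_B} \left[ \mathbb{E}_U \xi_0^2 + \sum_{i=1}^{4^{n_A}-1} \mathbb{E}_U \xi_i^2 \right] = 2^{n+n_B} \left[ 2^{-2n} + (4^{n_A} - 1)\, \mathbb{E}_U \xi_1^2 \right].$$

We have used the fact that $\mathrm{Tr}(\rho) = 1 = 2^n \xi_0 \Rightarrow \mathbb{E}_U \xi_0^2 = 2^{-2n}$, and that $\mathbb{E}_U \xi_i^2$ takes the same value for all $i \neq 0$, which
we show next. Consider two elements $g_i, g_j$ with $i, j \neq 0$. Those elements are connected by some unitary operation $V$,
i.e. $g_j = V g_i V^\dagger$. Thus,

$$\mathbb{E}_U \xi_j^2 = 2^{-2n}\, \mathbb{E}_U \left( \mathrm{Tr}[\varphi\, U^\dagger g_j U] \right)^2 = 2^{-2n}\, \mathbb{E}_U \left( \mathrm{Tr}[\varphi\, U^\dagger V g_i V^\dagger U] \right)^2 = \mathbb{E}_U \xi_i^2$$

due to the unitary invariance of the Haar measure. Now we exploit the fact that

$$\mathrm{Tr}(\varphi^2) = \mathrm{Tr}(\rho^2) = 2^n \left( \sum_{i=1}^{4^n - 1} \xi_i^2 + 2^{-2n} \right).$$

Taking the expectation value of this expression gives $\mathbb{E}_U \xi_i^2 = 2^{-n} \cdot \frac{\mathrm{Tr}(\varphi^2) - 2^{-n}}{4^n - 1}$ for $i \neq 0$. This
can be substituted into the expression above, proving the statement of the lemma. □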
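As a sanity check on Lemma 2, the following sketch evaluates the right-hand side exactly with rational arithmetic and compares it, for a globally pure state, with the well-known average purity $(d_A + d_B)/(d_A d_B + 1)$ of a Haar-random pure bipartite state with local dimensions $d_A = 2^{n_A}$, $d_B = 2^{n_B}$. The function names are illustrative and not taken from the paper.

```python
from fractions import Fraction


def expected_local_purity(n_a: int, n_b: int, global_purity: Fraction) -> Fraction:
    """E_U Tr[(rho^A)^2] as given by Lemma 2 for n_a + n_b qubits."""
    n = n_a + n_b
    k_a, k_ab = 4 ** n_a, 4 ** n
    ratio = Fraction(2 ** n_b * (k_a - 1), k_ab - 1)
    return Fraction(1, 2 ** n_a) + ratio * (global_purity - Fraction(1, 2 ** n))


def haar_pure_state_average(n_a: int, n_b: int) -> Fraction:
    """Standard closed form (d_A + d_B) / (d_A d_B + 1) for globally pure states."""
    d_a, d_b = 2 ** n_a, 2 ** n_b
    return Fraction(d_a + d_b, d_a * d_b + 1)


if __name__ == "__main__":
    for n_a, n_b in [(1, 1), (2, 3), (1, 10)]:
        value = expected_local_purity(n_a, n_b, Fraction(1))
        print(n_a, n_b, value, value == haar_pure_state_average(n_a, n_b))
```

Setting the global purity to its minimum $2^{-n}$ returns exactly $2^{-n_A}$, the purity of the maximally mixed local state.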
Some remarks:

1. We see that $\mathbb{E}_U \mathrm{Tr}[(\rho^A)^2] \approx 2^{n_B - n} = 2^{-n_A}$ when $n \gg n_A \gg 1$. This is the minimum value it can take.

2. The purity condition $\mathrm{Tr}(\rho^2) = 1$ enforces the uncertainty principle, forcing most of the $\xi_i^2$ to be small.

3. Apart from the purity restriction, the observer is constrained by having access to only the $K_A - 1 = 4^{n_A} - 1$
local observables out of a total of $K_{AB} - 1 = 4^n - 1$. This ratio appears directly in the statement of the lemma.

Perhaps surprisingly, the proof above can also be adapted to classical probability theory (with $n_A + n_B$ classical bits).
The only difference in the result will be that $K_A$ and $K_{AB}$ have to be replaced by their classical values $2^{n_A}$ and $2^n$, in
agreement with Subsection II B. Remark 2 above suggests that this result is related to the absence of uncertainty in
classical pure states.
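The classical statement can be checked by brute force for very small systems. The sketch below assumes that the classical analogue of a random unitary is a uniformly random permutation of the $2^n$ configurations, and that the classical formula is the one of Lemma 2 with $K_A$, $K_{AB}$ replaced by $2^{n_A}$, $2^n$ and $\mathrm{Tr}(\cdot^2)$ replaced by the collision probability $\sum_i p_i^2$. All names are illustrative.

```python
from fractions import Fraction
from itertools import permutations


def local_collision_probability(p, n_b_states):
    """Sum_a (p^A_a)^2, with joint index a * n_b_states + b and B marginalized."""
    n_a_states = len(p) // n_b_states
    marginal = [sum(p[a * n_b_states + b] for b in range(n_b_states))
                for a in range(n_a_states)]
    return sum(x * x for x in marginal)


def average_over_permutations(p, n_b_states):
    """Exact average of the local collision probability over all permutations."""
    perms = list(permutations(range(len(p))))
    total = sum(local_collision_probability([p[i] for i in perm], n_b_states)
                for perm in perms)
    return total / len(perms)


def classical_prediction(p, n_a, n_b):
    """Lemma 2 with K_A -> 2**n_a and K_AB -> 2**(n_a + n_b)."""
    n = n_a + n_b
    ratio = Fraction(2 ** n_b * (2 ** n_a - 1), 2 ** n - 1)
    return Fraction(1, 2 ** n_a) + ratio * (sum(x * x for x in p) - Fraction(1, 2 ** n))


if __name__ == "__main__":
    p = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 8), Fraction(1, 8)]
    print(average_over_permutations(p, 2), classical_prediction(p, 1, 1))
```

The enumeration grows factorially, so this is only practical for two or three bits; it is meant as a check of the formula, not as a simulation method.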

The proof above also illustrates some main ideas for the derivation of the general probabilistic result in Section III.
A useful insight above was to consider the linear maps $\xi_i = \xi_i(\rho)$, and to see that purity can in general be expressed
as a sum over $\xi_i(\rho)^2$. More specifically, the local reduced state's purity can be expressed as a sum over $\xi_i(\rho)^2$ for a
certain type of $\xi_i$'s, namely those which act locally: $\xi_i(\rho) = 2^{-n}\, \mathrm{Tr}(\rho\,(g \otimes \mathbb{1}))$. Since they are all connected by reversible
transformations, they all have the same Haar expectation value.

The general case will use a very similar construction, where the sum is replaced by an integral over the group,
and the map $\xi_i$ is replaced by a general "Pauli map" (cf. Lemma 13). In analogy to the quantum case, it turns out
that the local reduced state's purity can be expressed as an integral over a certain type of Pauli map, namely one
which acts locally (cf. Lemma 21 and the proof of Theorem 22). Again, due to invariance with respect to reversible
transformations, all these maps in the integral have the same Haar expectation value, giving rise to the proof of our
main result.



III. MATHEMATICAL FRAMEWORK AND PROOFS 

A. General probabilistic theories and the Bloch representation 

We work in the framework of general probabilistic theories, a natural mathematical framework which describes basic
operational laboratory situations like preparations, transformations, and measurements. Quantum theory can be
described within the framework, as well as classical probability theory and a large class of possible generalizations.
For an introduction to this framework, and in particular for the physical motivation, see e.g. [24, 28, 43]. Our notation
is particularly close to [43] and [28].

A state space is a tuple $(A, A_+, u^A)$, where $A$ is a real vector space of finite dimension $K_A$ (we will not consider
infinite-dimensional state spaces in this paper), and $A_+ \subset A$ is a proper cone (that is, a closed, convex cone of full
dimension which does not contain lines). It can be interpreted as the set of unnormalized states. $u^A$ is a linear
functional which is strictly positive on $A_+ \setminus \{0\}$ and is called the order unit of $A$. The set of points $\omega \in A_+$ with
$u^A(\omega) = 1$ is called the set of (normalized) states and is denoted $\Omega_A$. It follows that $A_+ = \bigcup_{\lambda \geq 0} \lambda\, \Omega_A$ and that $\Omega_A$
is a compact convex $(K_A - 1)$-dimensional set. Its extremal points are called pure states, the others are mixed states.
Instead of the full tuple, we will usually just call $A$ the "state space".

A linear invertible map $T : A \to A$ is called a symmetry if $T(A_+) = A_+$ and $u^A \circ T = u^A$. That is, symmetries $T$
map the set of normalized states $\Omega_A$ bijectively onto itself. The example of a qubit shows that not all symmetries of
a state space have to be allowed transformations: reflections in the Bloch ball are symmetries, but are not physically
allowed since they do not correspond to completely positive maps. Thus, in order to define reversible dynamics on
a state space, we also have to specify a group $\mathcal{G}_A$ of (allowed) reversible transformations. For the sake of generality,
we allow arbitrary choices of $\mathcal{G}_A$, as long as $\mathcal{G}_A$ is compact and contains only symmetries [60]. Then, a pair $(A, \mathcal{G}_A)$,
where $A$ is a state space (equivalently: a tuple $(A, A_+, u^A, \mathcal{G}_A)$) will be called a dynamical state space. Again, to save
some ink, we will usually denote the dynamical state space simply by the letter $A$ rather than by the full tuple.

One goal of this paper is to investigate properties of random pure states on general state spaces. In order to have 
a meaningful mathematical notion of "random states", we need the following property: 

Definition 3 (Transitivity). A dynamical state space $A$ is called transitive if for every pair of pure states $\alpha, \omega \in \Omega_A$ there
exists a reversible transformation $T \in \mathcal{G}_A$ such that $T\alpha = \omega$.

Since $\mathcal{G}_A$ is compact, we have the notion of a Haar measure on that group [26]. Thus, we can draw a pure state by
applying a random reversible transformation $T \in \mathcal{G}_A$ to an arbitrary given pure state $\omega$. Transitivity is required so
that the resulting distribution does not depend on the initial state $\omega$. The property of transitivity is thus a necessary
mathematical prerequisite in order to have an unambiguous notion of "random pure states".

It is also questionable whether reversible theories without transitivity are self-consistent in a specific physical sense.
Imagine for example that given a product state there is no reversible way to transform it into a pure but correlated
(and thus entangled) state. This is the case for the theory with PR-boxes known as boxworld [44]. Then one cannot, for
example, model a measurement as a reversible correlating interaction between a memory system and the system
in question. The possibility of modelling measurement interactions in that way seems to play a fundamental role
for the self-consistency of quantum theory and statistical mechanics (cf. Maxwell's demon and Bennett's reversible
measurements). In accordance with the two justifications above we shall henceforth, unless otherwise stated, assume
that we are dealing with transitive state spaces.

Definition 4 (Maximally mixed state). If $A$ is a transitive dynamical state space, let $\omega \in \Omega_A$ be an arbitrary pure state, and
define the maximally mixed state $\mu^A$ on $A$ by

$$\mu^A := \int_{G \in \mathcal{G}_A} G(\omega)\, dG.$$

It follows from transitivity that $\mu^A$ does not depend on the choice of $\omega$. Clearly, $T\mu^A = \mu^A$ for all $T \in \mathcal{G}_A$, and $\mu^A$
is the unique state on $A$ with this invariance property. The space $A$ can be decomposed into a direct sum

$$A = \mathbb{R}\mu^A \oplus \hat{A},$$

where $\mathbb{R}\mu^A$ denotes the one-dimensional subspace which is spanned by $\mu^A$, and $\hat{A}$ is defined as the set of all vectors
$a \in A$ with $u^A(a) = 0$.
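For a finite group the Haar integral in Definition 4 is a uniform average over group elements. A minimal sketch for the classical case, assuming the group of reversible transformations is the full permutation group (as in the classical example discussed later in this section), averages a pure state over all permutations; the function name is illustrative.

```python
from fractions import Fraction
from itertools import permutations


def maximally_mixed_state(pure_state):
    """Uniform average of G(omega) over all permutations G of the entries."""
    n = len(pure_state)
    perms = list(permutations(range(n)))
    return [sum(Fraction(pure_state[perm[i]]) for perm in perms) / len(perms)
            for i in range(n)]


if __name__ == "__main__":
    # Any pure state (a point distribution) gives the same uniform result.
    print(maximally_mixed_state([1, 0, 0]))
    print(maximally_mixed_state([0, 0, 1]))
```

Both calls return the uniform distribution, illustrating that the average does not depend on the chosen pure state when the group acts transitively.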

Definition 5 (Bloch vector). Given any state $\omega \in \Omega_A$ (or, more generally, any point $\omega \in A$ with $u^A(\omega) = 1$), we define its
corresponding Bloch vector $\hat\omega$ as

$$\hat\omega := \omega - \mu^A \in \hat{A}.$$

The set of all Bloch vectors $\hat\omega$ with $\omega \in \Omega_A$ will be called $\hat\Omega_A$.

Note that convex combinations of states yield the corresponding convex combinations of the Bloch vectors:
$\widehat{\left(\sum_i \lambda_i \omega_i\right)} = \sum_i \lambda_i \hat\omega_i$ if $\sum_i \lambda_i = 1$. Every reversible transformation $T \in \mathcal{G}_A$ leaves $\mu^A$ invariant. Thus, we have

$$\widehat{(T\omega)} = T\omega - \mu^A = T(\omega - \mu^A) = T\hat\omega,$$

and applying a transformation $T$ to a state is equivalent to applying it to the corresponding Bloch vector.

B. Definition and properties of purity 

We would like to define a notion of purity in generalized state spaces. In the quantum case, this is just
$\mathrm{Tr}(\rho^2) = \langle \rho, \rho \rangle$, where $\langle X, Y \rangle := \mathrm{Tr}(XY)$ denotes the Hilbert-Schmidt inner product on Hermitian matrices. This
inner product is very special: it is invariant with respect to the reversible transformations of quantum theory,
$\langle UAU^\dagger, UBU^\dagger \rangle = \langle A, B \rangle$ for all unitaries $U$. Thus, it makes sense to ask for the existence of an analogous inner
product in more general theories.

Lemma 6. Let $A$ be a transitive dynamical state space. Then the following statements are equivalent:

• There is a unique inner product $\langle \cdot, \cdot \rangle$ on $\hat{A}$ (up to constant multiples) such that all reversible transformations $T \in \mathcal{G}_A$ are
orthogonal [61].

• $\mathcal{G}_A$ acts irreducibly on $\hat{A}$; that is, $\hat{A}$ does not contain any proper subspace which is invariant under the action of all
reversible transformations $T \in \mathcal{G}_A$.




Proof. This is a standard use of the (real version of) Schur's Lemma, see Proposition VIII.2.3 in [26]. □

If $\mathcal{G}_A$ acts irreducibly on $\hat{A}$, we call the dynamical state space $A$ irreducible. In other words: $A$ is irreducible if and
only if the only non-trivial subspaces which are invariant under all transformations of $\mathcal{G}_A$ are $\hat{A}$ and $\mathbb{R}\mu^A$.

Transitive dynamical state spaces are not automatically irreducible. As a simple example, consider a state space
$\Omega_A$ which is a cylinder (as in Figure 1) and where $\mathcal{G}_A$ contains all symmetries. The pure states are the points on the
two circles. By rotation and reflection, every pure state can be reversibly mapped to every other, such that we have
transitivity. However, it is not irreducible: the symmetry axis and the plane orthogonal to it, intersecting the cylinder's
center, are invariant subspaces.





Figure 1: If $\Omega_A$ is a cylinder, then the corresponding state space is transitive, but not irreducible.

Now it is straightforward to introduce a generalized notion of purity. 

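Definition 7. Let $A$ be a transitive and irreducible state space, and let $\langle \cdot, \cdot \rangle$ be the unique inner product on $\hat{A}$ such that all
transformations are orthogonal and $\langle \hat\alpha, \hat\alpha \rangle = 1$ for pure states $\alpha$. Then, the purity $\mathcal{P}(\omega)$ of any state $\omega \in \Omega_A$ is defined as the
squared length of the corresponding Bloch vector, i.e.

$$\mathcal{P}(\omega) := \|\hat\omega\|^2 = \langle \hat\omega, \hat\omega \rangle.$$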

It is straightforward to deduce some useful properties that follow from this definition: 
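Lemma 8 (Properties of Purity). Let $A$ be a transitive and irreducible dynamical state space, then

1. $0 \leq \mathcal{P}(\omega) \leq 1$ for all $\omega \in \Omega_A$,

2. $\mathcal{P}(\omega) = 0$ if and only if $\omega = \mu^A$, i.e. if $\omega$ is the maximally mixed state on $A$,

3. $\mathcal{P}(\omega) = 1$ if and only if $\omega$ is a pure state,

4. $\mathcal{P}(T\omega) = \mathcal{P}(\omega)$ for all reversible transformations $T \in \mathcal{G}_A$ and states $\omega \in \Omega_A$,

5. $\sqrt{\mathcal{P}}$ is convex, i.e.

$$\sqrt{\mathcal{P}\left( \sum_{i=1}^n \lambda_i \omega_i \right)} \leq \sum_{i=1}^n \lambda_i \sqrt{\mathcal{P}(\omega_i)}$$

if $\lambda_i \geq 0$, $\sum_i \lambda_i = 1$, and all $\omega_i \in \Omega_A$.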

Proof. First, 5. follows directly from the fact that $\sqrt{\mathcal{P}(\omega)} = \|\hat\omega\|$ is a norm (use the triangle inequality). That pure
states $\omega$ have $\mathcal{P}(\omega) = 1$ follows directly from Definition 7. Since every state $\omega$ can be written as a convex combination
of pure states, it follows from 5. that $\mathcal{P}(\omega) \leq 1$ for all states $\omega$, and $\mathcal{P}(\omega) = \|\hat\omega\|^2 \geq 0$ is clear. We have proven 1.
Clearly, from $0 = \mathcal{P}(\omega) = \|\hat\omega\|^2$, it follows that $\hat\omega = 0$, so $\omega = \mu^A$, and this proves 2. Since the inner product was
chosen such that reversible transformations are orthogonal, it follows that

$$\mathcal{P}(T\omega) = \langle T\hat\omega, T\hat\omega \rangle = \langle \hat\omega, \hat\omega \rangle = \mathcal{P}(\omega),$$

which proves 4. Now consider the ball $B := \{x \in \hat{A} : \langle x, x \rangle \leq 1\}$. If $\omega$ is any state with $\mathcal{P}(\omega) = 1$, then $\hat\omega$ is on the surface of that
ball; in particular, $\hat\omega$ is an exposed point of the convex set $B$. But since $\hat\omega \in \hat\Omega_A$ and $\hat\Omega_A \subset B$, it must then also be an
exposed point of $\hat\Omega_A$, hence a pure state. This proves 3. Note that it also proves that all pure states are exposed. □

In general, our definition of purity only works for transitive state spaces. Unfortunately, in the case of bipartite
(and multipartite) state spaces, this already excludes the most popular general probabilistic theory, colloquially
called boxworld. As it turns out, there is a natural way to define an analogous notion of purity in boxworld, which
we explain in Appendix B. However, the resulting notion of purity does not have all the nice properties of Lemma 8
any more: in particular, it equals unity for some pure (product) states, but is necessarily less than one for other pure
(PR box) states.

In the quantum case, our definition of purity coincides with the standard definition up to a factor and an offset: 

Example 9 (Purity of Quantum States). The real vector space which describes the states on a quantum $n$-level system is the
set of Hermitian complex $n \times n$ matrices,

$$A = \{M \in \mathbb{C}^{n \times n} \mid M = M^\dagger\}.$$

The cone of unnormalized states is given by all positive matrices, while the order unit is the trace functional:

$$A_+ = \{M \in A \mid M \geq 0\}, \qquad u^A(\rho) = \mathrm{Tr}(\rho).$$

Thus, the set of normalized states $\Omega_A$ is the usual set of density matrices; similarly, the Bloch vector space $\hat{A}$ is the set of traceless
Hermitian matrices. The group of reversible transformations $\mathcal{G}_A$ is the projective unitary group,

$$\mathcal{G}_A = \{U \cdot U^\dagger \mid U \in SU(n)\},$$

and this group acts irreducibly on $\hat{A}$ (this follows from Lemma 43 in the appendix). Thus, there is a unique inner product on
$\hat{A}$ such that all reversible transformations are orthogonal. It is easy to guess (we mentioned it before): it is the Hilbert-Schmidt
inner product, scaled such that pure state Bloch vectors have norm 1:

$$\langle L, M \rangle := \frac{n}{n-1}\, \mathrm{Tr}(LM) \qquad (L, M \in \hat{A}).$$

As a consequence, the purity $\mathcal{P}(\rho)$ of any quantum state $\rho$ is

$$\mathcal{P}(\rho) = \langle \hat\rho, \hat\rho \rangle = \langle \rho - \mathbb{1}/n,\ \rho - \mathbb{1}/n \rangle = \frac{n}{n-1}\, \mathrm{Tr}(\rho^2) - \frac{1}{n-1}. \qquad (6)$$

Classical probability distributions can be treated in a similar manner: 

Example 10 (Purity of Classical Probability Distributions). The state space $\Omega_B$ of a classical $n$-level system is the set of all
probability distributions on $n$ outcomes, that is, the simplex

$$\Omega_B = \left\{ (p_1, \ldots, p_n) \;\middle|\; p_i \geq 0,\ \sum_i p_i = 1 \right\}.$$

This state space is contained in the vector space $B = \mathbb{R}^n$ with order unit $u^B(p) := \sum_{i=1}^n p_i$ for $p = (p_1, \ldots, p_n) \in B$. The
cone of unnormalized states is

$$B_+ = \{p = (p_1, \ldots, p_n) \in B \mid p_i \geq 0 \text{ for all } i\},$$

and the group of reversible transformations $\mathcal{G}_B$ is the permutation group $S_n$. The unique state which is invariant with respect to
$\mathcal{G}_B$ is the maximally mixed state $\mu^B = (\frac{1}{n}, \frac{1}{n}, \ldots, \frac{1}{n})$. It is a well-known fact of group representation theory [26] that $\mathcal{G}_B$ acts
irreducibly on $\hat{B} = \{p \in B \mid \sum_i p_i = 0\}$. The unique invariant inner product on $\hat{B}$ turns out to be

$$\langle p, q \rangle := \frac{n}{n-1} \sum_{i=1}^n p_i q_i \qquad (p, q \in \hat{B}).$$

Permutations relabel the entries of $p$ and $q$ and preserve this inner product. Pure states $p = (0, \ldots, 0, 1, 0, \ldots, 0)$ have $\langle \hat{p}, \hat{p} \rangle = 1$.
Thus, the purity of a probability distribution $p$ on $n$ outcomes, using $\hat{p}_i = p_i - 1/n$, is

$$\mathcal{P}(p) = \langle \hat{p}, \hat{p} \rangle = \frac{n}{n-1} \sum_{i=1}^n p_i^2 - \frac{1}{n-1}. \qquad (7)$$

This is the quantum result (6) restricted to diagonal matrices (as expected); however, here it is derived without embedding the
probability distributions into quantum state space.
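The classical purity formula lends itself to a short numerical illustration of the properties listed in Lemma 8. The sketch below implements $\mathcal{P}(p) = \frac{n}{n-1}\sum_i p_i^2 - \frac{1}{n-1}$ with exact rationals; applied to the eigenvalues of a density matrix, the same function gives the quantum purity of eq. (6). The function name is illustrative.

```python
import math
from fractions import Fraction


def classical_purity(p):
    """P(p) = n/(n-1) * sum_i p_i^2 - 1/(n-1) for a probability vector p."""
    n = len(p)
    collision = sum(Fraction(x) ** 2 for x in p)
    return Fraction(n, n - 1) * collision - Fraction(1, n - 1)


if __name__ == "__main__":
    pure_a, pure_b = [1, 0, 0], [0, 1, 0]
    uniform = [Fraction(1, 3)] * 3
    mixture = [Fraction(1, 2), Fraction(1, 2), 0]  # equal mixture of pure_a and pure_b

    print(classical_purity(pure_a))    # pure states have purity 1
    print(classical_purity(uniform))   # the maximally mixed state has purity 0
    print(classical_purity(mixture))

    # Property 5 of Lemma 8: sqrt(P) of the mixture is bounded by the average sqrt(P).
    lhs = math.sqrt(classical_purity(mixture))
    rhs = 0.5 * math.sqrt(classical_purity(pure_a)) + 0.5 * math.sqrt(classical_purity(pure_b))
    print(lhs <= rhs)
```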

Example 11 (Purity for a gbit). Consider a generalized bit, or "gbit" [24], where the state space $\Omega_A$ is a square as in Figure 2.
It can be understood as describing "one half of a PR box" [24], and as a particular type of state space that appears in a theory
called generalized nonsignaling theory or boxworld [44].

We assume that the group of reversible transformations $\mathcal{G}_A$ is the group of all symmetries, which is consistent with the tensor
product in boxworld [44]. Then $\mathcal{G}_A$ is the dihedral group $D_4$, containing all rotations of multiples of $\pi/2$ and reflections
through diagonals. This group acts irreducibly on $\hat{A} = \mathbb{R}^2$. If we represent $\Omega_A$ as a square with $\mu^A = 0$ as the center, then the
invariant inner product is given by the usual Euclidean inner product. That is, the contour lines of constant purity correspond
to circles in the state space, see Figure 2.

C. Comparison to existing entropy measures 

The square state space (the "gbit"), mentioned in Example 11 and depicted in Figure 2, is also a good example to
highlight a difference between the generalized purity used here and another possible generalization.

In the quantum case, the standard purity satisfies the equation $\mathrm{Tr}(\rho^2) = 2^{-H_2(\rho)}$, where $H_2$ denotes the Rényi
entropy of order 2. There has been some work on notions of entropy in general probabilistic theories [25, 45, 46].
One could now imagine defining the purity of some state $\omega$ as $2^{-H_2(\omega)}$.

However, such a definition would have undesirable properties, as can be seen from the example of the definitions in [45].
Two possible definitions of $H_2$ are considered there: one possibility is to define the measurement entropy $H_2(\omega)$ of
some state $\omega$ as the minimum Rényi-2 entropy of the set of outcome probabilities of any fine-grained measurement
on $\omega$. However, using the measurement entropy, the corresponding definition of purity would assign purity 1 to
some highly mixed states in the boundary of the square state space, that is, the same value as for pure states. As an
example, consider the fine-grained measurement on the gbit which consists of the two effects Ei{uj) :— ^/2ujy and 
E2{(-o) := 1 — \f2Gjy, if a) = (w^, Wj,) denotes the Bloch vector corresponding to lo (that is, the corresponding point in 
the square). The state w = (0, ^/^f^) is a mixed state in the boundary of the square, but the measurement {Ei, E2) 
assigns outcome probabilities and 1 to this state. Hence 2^^^'"^ = 1. This shows that the measurement entropy 
can be misleading if used as a characterization of the mixedness of a state [62].

A second definition is the decomposition entropy, $H_2(\omega)$, which is defined as the minimum Rényi 2-entropy of any
probability distribution $(\lambda_1, \ldots, \lambda_n)$ with $n \in \mathbb{N}$ and $\sum_i \lambda_i = 1$ such that $\omega = \sum_i \lambda_i \omega_i$, with pure states $\omega_i$. That is,
it is the minimal entropy of the coefficients in any decomposition of $\omega$ into pure states. However, as shown in [45,
(D8)], there are states $\omega_1, \omega_2$ in the gbit state space with the property that $H_2\left(\frac{1}{2}\omega_1 + \frac{1}{2}\omega_2\right) < \frac{1}{2} H_2(\omega_1) + \frac{1}{2} H_2(\omega_2)$.
According to the corresponding purity definition, the mixture $\frac{1}{2}\omega_1 + \frac{1}{2}\omega_2$ would have higher purity than both $\omega_1$
and $\omega_2$, violating intuition about mixtures being "at least as mixed" as their components. In contrast, it follows from
property 5. in Lemma 8 that our notion of purity $\mathcal{P}$ always satisfies $\mathcal{P}\left(\frac{1}{2}\omega_1 + \frac{1}{2}\omega_2\right) \leq \max\{\mathcal{P}(\omega_1), \mathcal{P}(\omega_2)\}$.

Another advantage of our definition of purity, as compared to other approaches, is that it satisfies several useful
identities arising from group theory. Thus, it is sometimes possible to calculate its value explicitly on the basis of
simple properties of the state space (such as in Theorem 28), which is important to derive the results of this paper.
This is analogous to the situation in quantum theory, where purity is often used as an easy-to-calculate replacement
for von Neumann entropy.

D. Irreducible subgroups and generalized Paulis 

In the quantum case, there is a simple formula expressing purity in terms of squared expectation values of Pauli
operators. For a single qubit, denote the $2 \times 2$ Pauli matrices by $(X_0, X_1, X_2, X_3) := (\mathbb{1}, X, Y, Z)$, then

$$\mathrm{Tr}(\rho^2) = \frac{1}{2} \sum_{i=0}^{3} \left( \mathrm{Tr}(X_i \rho) \right)^2.$$

A similar formula holds in the case of several qubits; we discuss this below.
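The single-qubit identity $\mathrm{Tr}(\rho^2) = \frac{1}{2}\sum_{i=0}^{3} (\mathrm{Tr}(X_i\rho))^2$ can be verified directly with explicit $2 \times 2$ matrices. The sketch below uses plain nested lists and the standard Bloch parametrization $\rho = \frac{1}{2}(\mathbb{1} + r_x X + r_y Y + r_z Z)$; helper names are illustrative.

```python
def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def trace(a):
    return a[0][0] + a[1][1]


PAULIS = [
    [[1, 0], [0, 1]],       # identity
    [[0, 1], [1, 0]],       # X
    [[0, -1j], [1j, 0]],    # Y
    [[1, 0], [0, -1]],      # Z
]


def bloch_state(rx, ry, rz):
    """Density matrix (1 + r . sigma) / 2 for a Bloch vector with |r| <= 1."""
    return [[(1 + rz) / 2, (rx - 1j * ry) / 2],
            [(rx + 1j * ry) / 2, (1 - rz) / 2]]


def purity_direct(rho):
    return trace(matmul(rho, rho)).real


def purity_from_paulis(rho):
    return 0.5 * sum(trace(matmul(x, rho)).real ** 2 for x in PAULIS)


if __name__ == "__main__":
    rho = bloch_state(0.3, -0.4, 0.5)
    print(purity_direct(rho), purity_from_paulis(rho))
```

Both numbers agree with $\frac{1}{2}(1 + |r|^2)$, which makes explicit that only the three non-identity Paulis carry information about the state.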

Figure 2: The left and center panels display a "gbit" state space $A$, where $\Omega_A$ is a square and the capacity is $N_A = 2$; shown are the four pure
states and the maximally mixed state $\mu = 0$. The symmetry group is the dihedral group $\mathcal{G}_A = D_4$, and the contour lines of constant purity
are circles (cf. Example 11). It has a complete set of Paulis, consisting of 2 maps, which is the minimal possible number. The right panel shows
a pentagon state space $\Omega_B$. The group of symmetries is the dihedral group $\mathcal{G}_B = D_5$, containing rotations of multiples of $\alpha = 2\pi/5$. This
state space has capacity $N_B = 2$, a complete set of Paulis consists of 5 maps, and the maximally mixed state cannot be written as a uniform
mixture of two perfectly distinguishable pure states, in contrast to the square.

As it turns out, there is an interesting generalization of these identities to the general probabilistic case. To
understand the general case, it is useful to think of the Pauli operators not as matrices, but as linear maps that assign
real numbers to states: $\rho \mapsto \mathrm{Tr}(X_i \rho)$. These maps have certain properties that correspond to the conditions in the
following definition:

Definition 12. Let $A$ be a transitive state space. A linear map $X : A \to \mathbb{R}$ is called a Pauli map if

• $X(\mu^A) = 0$ for the maximally mixed state $\mu^A \in \Omega_A$, and

• $\max\{|X(a)| \;:\; a \in \hat{A},\ \langle a, a \rangle \leq 1\} = 1$.

If $X$ is a Pauli map, then there exists a vector $\hat{X} \in \hat{A}$ such that $X(a) = \langle \hat{X}, a \rangle$ for all $a \in \hat{A}$. Thus, the second
condition in Definition 12 is equivalent to the condition $\|\hat{X}\|_2 = 1$, where $\|\hat{X}\|_2 = \sqrt{\langle \hat{X}, \hat{X} \rangle}$ denotes the norm on $\hat{A}$
which is derived from the invariant inner product $\langle \cdot, \cdot \rangle$ on $\hat{A}$.

In the quantum case of a single qubit, it is easy to see that the maps $\rho \mapsto \mathrm{Tr}(X_i \rho)$ are Pauli maps if $i \in \{1, 2, 3\}$, but
not if $i = 0$: recall the definitions in Example 9, and let $i \in \{1, 2, 3\}$, then

$$\hat\rho \mapsto \mathrm{Tr}(X_i \hat\rho) = \mathrm{Tr}(X_i \rho) = \langle \hat{X}_i, \hat\rho \rangle = 2\, \mathrm{Tr}(\hat{X}_i \hat\rho),$$

which proves that this map is represented by the vector (traceless Hermitian matrix) $\hat{X}_i = \frac{1}{2} X_i \in \hat{A}$, that is, one half
times the corresponding Pauli matrix. Then, we have for the norm on $\hat{A}$

$$\|\hat{X}_i\|_2^2 = \langle \hat{X}_i, \hat{X}_i \rangle = 2\, \mathrm{Tr}(\hat{X}_i^2) = 2\, \mathrm{Tr}\left( \tfrac{1}{4} \mathbb{1} \right) = 1,$$

which proves that the corresponding maps are Pauli maps in the sense of Definition 12.
Pauli maps are related to purity by the following lemma: 

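Lemma 13. Let $A$ be an irreducible transitive dynamical state space of dimension $K_A$, and let $\mathcal{H} \subseteq \mathcal{G}_A$ be a compact subgroup
which acts irreducibly on $\hat{A}$ (for example, $\mathcal{H} = \mathcal{G}_A$). Then, if $X$ is any Pauli map on $A$,

$$\int_{H \in \mathcal{H}} \left( X \circ H(\omega) \right)^2 dH = \frac{\mathcal{P}(\omega)}{K_A - 1}$$

for all states $\omega \in \Omega_A$.

Proof. If $M \geq 0$ is any positive matrix on $\hat{A}$, then $I := \int_{H \in \mathcal{H}} H M H^{-1}\, dH \geq 0$ satisfies $[I, H] = 0$ for all $H \in \mathcal{H}$. Thus,
by Schur's Lemma, we have $I = c \cdot \mathbb{1}_{\hat{A}}$ for some $c \in \mathbb{R}$. By taking the trace of both sides, we see that $c = \mathrm{Tr}\, M / (K_A - 1)$.
Since $(X \circ H(\omega))^2 = \langle \hat{X} | H | \hat\omega \rangle \langle \hat\omega | H^{-1} | \hat{X} \rangle$, we get

$$\int_{H \in \mathcal{H}} \left( X \circ H(\omega) \right)^2 dH = \langle \hat{X} | \left( \int_{H \in \mathcal{H}} H | \hat\omega \rangle \langle \hat\omega | H^{-1}\, dH \right) | \hat{X} \rangle = \langle \hat{X} | \frac{\mathcal{P}(\omega)}{K_A - 1}\, \mathbb{1}_{\hat{A}} | \hat{X} \rangle = \frac{\mathcal{P}(\omega)}{K_A - 1}.$$

Note that in the quantum case, the vectors themselves are operators, hence the matrices and linear maps appearing
in this calculation are superoperators. □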

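The result becomes particularly interesting if the subgroup $\mathcal{H} \subseteq \mathcal{G}_A$ is finite: it will finally give the natural analog
of Pauli matrices in more general theories.

Corollary 14. Let $A$ be an irreducible transitive dynamical state space of dimension $K_A$, and let $\mathcal{H} \subseteq \mathcal{G}_A$ be a finite subgroup
which acts irreducibly on $\hat{A}$. Fix any Pauli map $X_1$ on $A$, and let $\mathcal{X}$ be the orbit of $\mathcal{H}$ on $X_1$, disregarding the sign of each map.
That is, $\mathcal{X} := \{X_1 \circ H \mid H \in \mathcal{H}\} / \{+1, -1\}$. Then

$$\frac{1}{|\mathcal{X}|} \sum_{X \in \mathcal{X}} X(\omega)^2 = \frac{\mathcal{P}(\omega)}{K_A - 1} \qquad \text{for all } \omega \in \Omega_A,$$

and we call $\mathcal{X}$ a complete set of Paulis for $A$. Note that $X(\omega) = \langle \hat{X}, \hat\omega \rangle$.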

For a classical probability distribution $p = (p_1, \ldots, p_n)$, the expression $\sum_{i=1}^n p_i^2$ is sometimes called the collision
probability. It is directly related to our notion of purity $\mathcal{P}(p)$ by eq. (7), and can be interpreted as the probability that
two identically prepared copies of $p$ give the same outcome, if the random variable $i \in \{1, \ldots, n\}$ is measured. A
similar interpretation exists in the quantum case: we can ask for the probability of getting the same outcome, if we
measure two copies of a state $\rho$ in a fixed basis. Maximizing this collision probability over all bases yields $\mathrm{Tr}(\rho^2)$,
with the maximum being attained in the eigenbasis.

The following lemma generalizes this observation to other probabilistic theories. 

Lemma 15 (Operational Interpretation of Purity). Any Pauli map $X$ can be interpreted as a measurement, giving outcomes
$\pm 1$ on a state $\omega$ with probabilities $(1 \pm X(\omega))/2$. The corresponding expectation value is exactly $X(\omega)$.

Denote by $P_c^\omega(X)$ the probability that two successive measurements of $X$ on two identically prepared copies of $\omega$ give the
same outcome ("c" is for "collision probability"). Then it turns out that

$$\max_{X \text{ Pauli map}} P_c^\omega(X) = \frac{1}{2} \left( 1 + \mathcal{P}(\omega) \right).$$

Proof. We use the Cauchy-Schwarz inequality:

$$P_c^\omega(X) = \left( \frac{1 + X(\omega)}{2} \right)^2 + \left( \frac{1 - X(\omega)}{2} \right)^2 = \frac{1}{2} \left( 1 + X(\omega)^2 \right) = \frac{1}{2} + \frac{1}{2} \langle \hat{X}, \hat\omega \rangle^2 \leq \frac{1}{2} + \frac{1}{2} \|\hat{X}\|_2^2 \cdot \|\hat\omega\|_2^2 = \frac{1}{2} \left( 1 + \mathcal{P}(\omega) \right).$$

This upper bound is attained on the Pauli map corresponding to $\hat{X} := \hat\omega / \|\hat\omega\|_2$. □
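The maximization in Lemma 15 can be illustrated numerically. The sketch below assumes a two-dimensional Bloch space with the Euclidean inner product as the invariant inner product (as for the gbit), so that Pauli maps correspond to unit vectors $(\cos t, \sin t)$; it scans these directions and compares the best collision probability with $\frac{1}{2}(1 + \mathcal{P}(\omega))$. Function names and the grid resolution are illustrative choices.

```python
import math


def collision_probability(pauli_vector, bloch_vector):
    """P_c = ((1+x)/2)^2 + ((1-x)/2)^2 with x the Euclidean inner product."""
    x = sum(a * b for a, b in zip(pauli_vector, bloch_vector))
    return ((1 + x) / 2) ** 2 + ((1 - x) / 2) ** 2


def best_collision_probability(bloch_vector, steps=3600):
    """Scan unit vectors (cos t, sin t) on a uniform grid of angles."""
    best = 0.0
    for j in range(steps):
        t = 2 * math.pi * j / steps
        best = max(best, collision_probability((math.cos(t), math.sin(t)), bloch_vector))
    return best


if __name__ == "__main__":
    omega_hat = (0.3, 0.4)
    purity = sum(c * c for c in omega_hat)
    print(best_collision_probability(omega_hat), 0.5 * (1 + purity))
```

The scan approaches the bound from below; it is reached exactly only when the grid contains the direction of the Bloch vector itself.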

In quantum theory on k qubits, our notion of a "complete set of Paulis" reduces to the usual Pauli operators: 

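Example 16 (Paulis on k Qubits in Quantum Theory). Recall the quantum situation described in Example 9, but now on
$n = 2^k$-dimensional Hilbert space, i.e. $A$ is the quantum state space of $k$ qubits. A particular finite subgroup of $\mathcal{G}_A$ is given by
the Clifford group [47]

$$C_k := \{U \in U(2^k) \mid U P U^\dagger \in P_k \text{ for all } P \in P_k\},$$

where $P_k$ is the Pauli group on $k$ qubits, i.e. $P_k = \{\pm \sigma_1 \otimes \ldots \otimes \sigma_k \mid \sigma_i \in \{\mathbb{1}, X, Y, Z\}\}$. This group acts irreducibly by
conjugation on $\hat{A}$, the set of traceless Hermitian matrices on $(\mathbb{C}^2)^{\otimes k}$; we show this in Lemma 43 in the appendix. Consider $X^{\otimes k}$,
the $k$-fold tensor product of the Pauli matrix $X$. We would like to find a constant $c > 0$ such that $X_1(\rho) := c \cdot \mathrm{Tr}(X^{\otimes k} \rho)$
becomes a Pauli map. First, we calculate the vector (i.e. matrix) $\hat{X}_1$ which describes $X_1$:

$$X_1(\rho) = c \cdot \mathrm{Tr}(X^{\otimes k} \rho) = c \cdot \mathrm{Tr}(X^{\otimes k} \hat\rho) = \langle \hat{X}_1, \hat\rho \rangle = \frac{2^k}{2^k - 1}\, \mathrm{Tr}(\hat{X}_1 \hat\rho),$$

and we see that $\hat{X}_1 = c\,(2^k - 1)\, 2^{-k}\, X^{\otimes k}$. The constant $c$ is determined by normalization:

$$1 = \langle \hat{X}_1, \hat{X}_1 \rangle = \frac{2^k}{2^k - 1}\, \mathrm{Tr}(\hat{X}_1^2) = \frac{2^k}{2^k - 1} \cdot \frac{c^2 (2^k - 1)^2}{2^{2k}}\, \mathrm{Tr}\left( (X^{\otimes k})^2 \right) = c^2 (2^k - 1),$$

hence $c = 1/\sqrt{2^k - 1}$. Now if $H = U \cdot U^\dagger$ with $U \in C_k$, then

$$X_1 \circ H(\rho) = c \cdot \mathrm{Tr}(X^{\otimes k}\, U \rho U^\dagger) = c \cdot \mathrm{Tr}\left( (U^\dagger X^{\otimes k} U)\, \rho \right).$$

By choosing appropriate elements $H = U \cdot U^\dagger$ of the Clifford group, the matrix $X^{\otimes k} \in P_k$ is mapped to every other element of
$P_k$ except the identity. Ignoring the sign as suggested in Corollary 14, we get the orbit

$$\mathcal{X} = \left\{ \rho \mapsto \frac{1}{\sqrt{2^k - 1}}\, \mathrm{Tr}(\sigma_1 \otimes \ldots \otimes \sigma_k\, \rho) \;\middle|\; \sigma_i \in \{\mathbb{1}, X, Y, Z\},\ \text{not all } \sigma_i = \mathbb{1} \right\}.$$

This is a "complete set of Paulis" according to Corollary 14: these are the (maps corresponding to) the usual Pauli matrices.
Therefore, purity can be expressed as

$$\frac{1}{4^k - 1} \sum_{(\sigma_1, \ldots, \sigma_k) \neq (\mathbb{1}, \ldots, \mathbb{1})} \frac{\left( \mathrm{Tr}(\sigma_1 \otimes \ldots \otimes \sigma_k\, \rho) \right)^2}{2^k - 1} = \frac{\mathcal{P}(\rho)}{4^k - 1}.$$

The standard result $\mathrm{Tr}(\rho^2) = 2^{-k} \sum \mathrm{Tr}(\sigma_1 \otimes \ldots \otimes \sigma_k\, \rho)^2$, with the sum running over all $4^k$ Pauli strings including the identity, follows from some further simplification.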
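The two purity expressions for $k$ qubits can be checked numerically for small $k$. The sketch below builds all Pauli strings by Kronecker products on nested lists and evaluates both $\mathrm{Tr}(\rho^2) = 2^{-k}\sum \mathrm{Tr}(\sigma\rho)^2$ (sum over all $4^k$ strings) and the generalized purity $\mathcal{P}(\rho) = \frac{1}{2^k - 1}\sum_{\sigma \neq \mathbb{1}} \mathrm{Tr}(\sigma\rho)^2$. Helper names are illustrative, and the Bell state is just a convenient test input.

```python
from itertools import product

I2 = [[1, 0], [0, 1]]
X = [[0, 1], [1, 0]]
Y = [[0, -1j], [1j, 0]]
Z = [[1, 0], [0, -1]]


def kron(a, b):
    """Kronecker product of two square matrices given as nested lists."""
    return [[a[i][j] * b[k][l] for j in range(len(a)) for l in range(len(b))]
            for i in range(len(a)) for k in range(len(b))]


def trace_of_product(a, b):
    n = len(a)
    return sum(a[i][j] * b[j][i] for i in range(n) for j in range(n))


def pauli_expectations(rho, k):
    """Tr(sigma_1 x ... x sigma_k rho) for all 4**k strings, identity string first."""
    values = []
    for string in product([I2, X, Y, Z], repeat=k):
        op = string[0]
        for factor in string[1:]:
            op = kron(op, factor)
        values.append(complex(trace_of_product(op, rho)).real)
    return values


def standard_purity(rho, k):
    return sum(v * v for v in pauli_expectations(rho, k)) / 2 ** k


def generalized_purity(rho, k):
    non_identity = pauli_expectations(rho, k)[1:]
    return sum(v * v for v in non_identity) / (2 ** k - 1)


if __name__ == "__main__":
    # Projector onto (|00> + |11>)/sqrt(2).
    bell = [[0.5, 0, 0, 0.5], [0, 0, 0, 0], [0, 0, 0, 0], [0.5, 0, 0, 0.5]]
    mixed = [[0.25 if i == j else 0 for j in range(4)] for i in range(4)]
    print(standard_purity(bell, 2), generalized_purity(bell, 2))
    print(standard_purity(mixed, 2), generalized_purity(mixed, 2))
```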



For a classical n-level system introduced in Example 10 a complete set of Paulis consists of n maps that basically 
read out a probability vector's components: 



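Example 17 (Paulis in Classical Probability Theory). With the notation of Example 10, let $X_1 : B \to \mathbb{R}$ be the map

$$X_1(p) := p_1 - \frac{1}{n-1}\sum_{i=2}^{n} p_i.$$

For the maximally mixed state $\mu^B = (\frac{1}{n}, \ldots, \frac{1}{n})$, we have $X_1(\mu^B) = 0$. It is easy to check that $X_1(p) = \langle \hat X_1, \hat p\rangle$ for the invariant inner product on $\hat B$ if $\hat X_1 = \left(\frac{n-1}{n}, -\frac{1}{n}, \ldots, -\frac{1}{n}\right)$. Moreover, we have $\langle \hat X_1, \hat X_1\rangle = 1$, hence $X_1$ is a Pauli map according to Definition 12. Now let $H = \mathcal{G}_B = S_n$ be the full permutation group. If $\sigma \in H$ is any permutation with, say, $\sigma(i) = 1$, then $X_i(p) := X_1 \circ \sigma(p) = p_i - \frac{1}{n-1}\sum_{j \neq i} p_j$ (for normalized probability vectors $p \in \Omega_B$, this is just $X_i(p) = \frac{n}{n-1}\,p_i - \frac{1}{n-1}$). Thus, a complete set of Paulis $\mathcal{X}$ is given by the set of maps

$$\mathcal{X} = \left\{ p \mapsto p_i - \frac{1}{n-1}\sum_{j \neq i} p_j \;\middle|\; i \in \{1, \ldots, n\} \right\}.$$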


Then the formula from Corollary [l4| reproduces eq. (f^. 
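The classical Pauli maps of Example 17 are simple enough to test with exact rational arithmetic. The sketch below is an illustration, not part of the original text: the probability vector `p` is arbitrary, and the classical purity is assumed to be $\mathcal{P}(p) = \frac{n}{n-1}\sum_i (p_i - \frac{1}{n})^2$, which equals 1 on deterministic vectors and 0 on the uniform distribution.

```python
from fractions import Fraction

def pauli_values(p):
    """X_i(p) = p_i - (1/(n-1)) * sum_{j != i} p_j for every i."""
    n = len(p)
    total = sum(p)
    return [p_i - (total - p_i) / (n - 1) for p_i in p]

def classical_purity(p):
    """Assumed normalization: P(p) = n/(n-1) * sum_i (p_i - 1/n)^2."""
    n = len(p)
    return Fraction(n, n - 1) * sum((p_i - Fraction(1, n)) ** 2 for p_i in p)

# Illustrative normalized probability vector on n = 3 outcomes.
p = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]
n = len(p)

# Corollary 14: the average of the squared Pauli values equals P(p)/(K - 1),
# where K = n for a classical n-level system.
lhs = sum(x * x for x in pauli_values(p)) / n
rhs = classical_purity(p) / (n - 1)

# Each Pauli map vanishes on the maximally mixed (uniform) state.
uniform_values = pauli_values([Fraction(1, n)] * n)
```

With these numbers the Pauli values are 1/4, 0 and -1/4, and both sides of the identity equal 1/24.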

Example 18 (Paulis for Polygonal State Spaces). Consider state spaces which are regular polygons; that is, $\Omega$ is a regular n-gon inscribed in the unit circle as in Figure 2. Then complete sets of Paulis (in the sense of Corollary 14) look very different, depending on properties of the symmetry group $D_n$, the dihedral group. We illustrate this for the cases n = 4 and n = 5.

Let $\hat X_1 := (1, 0)$, such that the corresponding Pauli map acts on the Bloch space $\mathbb{R}^2$ via $X_1(\omega) = \langle \hat X_1, \hat\omega\rangle = \omega_1$, if $\hat\omega = (\omega_1, \omega_2)$. First, consider a "gbit" system A where the Bloch representation of state space, $\hat\Omega_A$, is a square, inscribed into a unit circle as in Figure 2. The symmetry group is the dihedral group $D_4$; its orbit on $X_1$ consists of the maps

$$\{X_1 \circ G \mid G \in D_4\} = \{(\omega_1,\omega_2) \mapsto \omega_1,\ (\omega_1,\omega_2) \mapsto -\omega_1,\ (\omega_1,\omega_2) \mapsto \omega_2,\ (\omega_1,\omega_2) \mapsto -\omega_2\}.$$

Disregarding the sign, we get a complete set of Paulis in the sense of Corollary 14, which is $\mathcal{X} = \{X_1, X_2\}$, where $X_1(\omega) = \omega_1$ and $X_2(\omega) = \omega_2$. Since $K_A = 3$, the formula from Corollary 14 becomes

$$\frac{1}{2}\left(\omega_1^2 + \omega_2^2\right) = \frac{\mathcal{P}(\omega)}{2} \quad \text{for all } \omega \in \Omega_A, \qquad (8)$$



which just expresses the fact that the purity equals the squared Euclidean length of the Bloch vector.

Now consider the case n = 5, that is, a state space B where $\Omega_B$ is a regular pentagon. A smallest irreducible subgroup is given by $H := \{R(2k\pi/5) \mid k = 0, 1, \ldots, 4\}$ (denoting its action on Bloch vectors), where $R(\alpha)$ denotes rotation by angle $\alpha$ in $\mathbb{R}^2$. In general, $(X_1 \circ R(2k\pi/5)(\omega))^2$ gives different values for all k, which means that a complete set of Paulis necessarily contains all the five maps. Thus, all we get is

$$\frac{1}{5}\sum_{k=0}^{4}\big(X_1 \circ R(2k\pi/5)(\omega)\big)^2 = \frac{\mathcal{P}(\omega)}{2} \quad \text{for all } \omega \in \Omega_B. \qquad (9)$$



In a sense, this is "inefficient": in order to compute $\mathcal{P}(\omega) = \|\hat\omega\|^2$, we could as well use eq. (8), which involves only two addends instead of five. However, the advantage of (9) compared to (8) is that all involved maps $X_k := X_1 \circ R(2k\pi/5)$ are "equivalent" for the state space B: they are all connected by reversible transformations. In other words, in order to build a device that measures $X_k$, it is sufficient to have a device measuring $X_1$. All other measurements can then be accomplished by composing $X_1$ with a reversible transformation $T_k := R(2k\pi/5)$, as sketched in Figure 3. Within state space B, this is not possible for the two maps $X_1(\omega) := \omega_1$ and $X_2(\omega) := \omega_2$ that appear in eq. (8).
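The pentagon identity can be confirmed numerically. The sketch below is illustrative only: the Bloch vector `omega` is an arbitrary point, and the purity is taken to be its squared Euclidean length, as stated for polygons inscribed in the unit circle.

```python
import math

def rotate(omega, angle):
    """Rotate a two-dimensional Bloch vector by the given angle."""
    c, s = math.cos(angle), math.sin(angle)
    return (c * omega[0] - s * omega[1], s * omega[0] + c * omega[1])

def x1(omega):
    """The Pauli map X_1: read out the first Bloch component."""
    return omega[0]

def squared_values(omega, n):
    """The values (X_1 o R(2 pi k / n)(omega))^2 for k = 0, ..., n-1."""
    return [x1(rotate(omega, 2 * math.pi * k / n)) ** 2 for k in range(n)]

omega = (0.3, -0.45)                      # arbitrary illustrative Bloch vector
purity = omega[0] ** 2 + omega[1] ** 2    # squared Euclidean length

values = squared_values(omega, 5)
five_map_average = sum(values) / 5        # average over the five rotated maps
two_map_average = (omega[0] ** 2 + omega[1] ** 2) / 2   # the two-map version
```

Both averages equal half the purity, but the five squared values are pairwise different for this vector, which is why no proper subset of the rotated maps suffices in general.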



Figure 3: All elements $X_k$ of a complete set of Paulis (in the sense of Corollary 14) on a general state space are connected by reversible transformations. That is, every $X_k$ can be measured by first applying a reversible transformation $T_k$, and then measuring a fixed Pauli $X_1$.



E. Bipartite systems: local purity as an entanglement measure 



The main goal of this paper is to investigate typical states on composite state spaces. When we have state spaces A and B, there are in general many different possible ways to combine them into a joint state space AB. However, there is a minimal set of assumptions that necessarily must hold in order to interpret AB as a "joint state space" in a physically meaningful way. The most important assumption is no-signalling: measurements on one subsystem do not affect the outcome probabilities on other subsystems. In this section, we make an additional simplifying assumption which is often (but not always) imposed in the framework of general probabilistic theories: that of local tomography. However, we will later drop this assumption in Subsection III H.



Assumption: Local tomography. If A and B are state spaces, then the joint state space AB has the property that states $\omega^{AB} \in \Omega_{AB}$ are uniquely characterized by the outcome probabilities of the local measurements on A and B.

From a physics point of view, this assumption means that the content of bipartite states consists of the correlations of outcome probabilities of local measurements. This is equivalent to the multiplicativity of the state space dimension: $K_{AB} = K_A K_B$. It can be shown [24] that this assumption implies the tensor product formalism: the linear space which carries the global unnormalized states is the algebraic tensor product of the local spaces, $AB = A \otimes B$. We have the notion of product states $\omega^A \otimes \omega^B \in \Omega_{AB}$ for states $\omega^A \in \Omega_A$, $\omega^B \in \Omega_B$, and similarly for effects, with the same interpretation as in the quantum case. In particular, the unit effect on AB is $u^{AB} = u^A \otimes u^B$. For global states $\omega^{AB} \in \Omega_{AB}$, we can define the reduced state $\omega^A \in \Omega_A$ by $L^A(\omega^A) := L^A \otimes u^B(\omega^{AB})$ for all linear functionals $L^A$ on A (in particular, for all effects).

In accordance with [24], we give a list of additional assumptions that naturally follow from the physical interpretation of a composite state space. First, if $\omega^A \in \Omega_A$ and $\omega^B \in \Omega_B$, then we assume that $\omega^A \otimes \omega^B \in \Omega_{AB}$. That is, we assume that it is possible to prepare states independently on A and B. Second, if $\omega^{AB} \in \Omega_{AB}$ is any global state, we assume that the local reduced states are valid states on A and B: $\omega^A \in \Omega_A$ and $\omega^B \in \Omega_B$. Since this work is on dynamical state spaces, we also postulate that reversible transformations can always be applied locally. That is, $\mathcal{G}_A \otimes \mathcal{G}_B \subseteq \mathcal{G}_{AB}$.

Similarly as in the quantum case, a global state $\omega^{AB} \in \Omega_{AB}$ will be called entangled if it cannot be written as a convex combination of product states. Now suppose $\omega^{AB}$ is pure.

• If the purity of the local reduced state is one, i.e. $\mathcal{P}(\omega^A) = 1$, then $\omega^A$ must be pure. From this, it follows that $\omega^{AB} = \omega^A \otimes \omega^B$, that is, the global state is unentangled.

• On the other hand, if $\mathcal{P}(\omega^A) < 1$, then $\omega^{AB}$ cannot be written as a product $\omega^A \otimes \omega^B$, since $\omega^A$ would necessarily have to be pure. Thus, $\omega^{AB}$ is entangled.

That is, the local purity $\mathcal{P}(\omega^A)$ can be understood as an entanglement measure: the smaller $\mathcal{P}(\omega^A)$, the "more entangled" $\omega^{AB}$. If $\mathcal{P}(\omega^A) = 0$, or equivalently $\omega^A = \mu^A$, we may call $\omega^{AB}$ maximally entangled. It turns out that a PR box
is an example of a maximally entangled post-quantum state in this sense lllll . 
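In the quantum case, the two extremes of local purity can be illustrated on two qubits. The sketch below is an assumption-laden illustration rather than part of the original text: states are stored as amplitude matrices `c[i][j]` for $|\psi\rangle = \sum_{ij} c_{ij}|i\rangle|j\rangle$, and the normalized purity $\mathcal{P}(\rho) = (d\,\mathrm{Tr}(\rho^2) - 1)/(d - 1)$ is assumed.

```python
import math

def reduced_state_a(c):
    """rho_A = C C^dagger for |psi> = sum_ij c[i][j] |i>|j>."""
    d_a, d_b = len(c), len(c[0])
    return [[sum(c[i][k] * c[j][k].conjugate() for k in range(d_b))
             for j in range(d_a)] for i in range(d_a)]

def normalized_purity(rho):
    """Assumed normalization: (d Tr(rho^2) - 1)/(d - 1), so pure -> 1, mixed -> 0."""
    d = len(rho)
    tr_sq = sum((rho[i][j] * rho[j][i]).real for i in range(d) for j in range(d))
    return (d * tr_sq - 1) / (d - 1)

# Product state |0>|0>: the local state is pure, so the global state is unentangled.
product_state = [[1.0, 0.0], [0.0, 0.0]]

# Bell state (|00> + |11>)/sqrt(2): the local state is maximally mixed.
s = 1 / math.sqrt(2)
bell_state = [[s, 0.0], [0.0, s]]

purity_product = normalized_purity(reduced_state_a(product_state))
purity_bell = normalized_purity(reduced_state_a(bell_state))
```

The product state gives local purity 1 and the Bell state gives local purity 0, matching the unentangled and maximally entangled cases described above.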

It is natural to ask for the typical entanglement of random pure states on a composite state space AB. As discussed in the context of Definition 3 above, in order for this notion to make sense, we need the property of transitivity: for every pair of pure states $\alpha, \omega \in \Omega_{AB}$, there must be a reversible transformation $T \in \mathcal{G}_{AB}$ such that $T\alpha = \omega$. It is important to note that transitivity of the local state spaces A and B does not imply transitivity of the joint state space AB. A simple example is given by a state space called "boxworld" [44]: suppose that A and B are both square state spaces as in the left of Figure 2, and AB is the state space which contains all no-signalling behaviours (including, for example, PR-box states). That is, $\Omega_{AB}$ is assumed to be the largest possible subset of AB that is consistent with the assumptions mentioned above ($\Omega_{AB}$ is sometimes called the "no-signalling polytope", or the "maximal tensor product" of the local state spaces). Then it turns out that the global state space is not transitive: for example, no reversible transformation takes a pure product state to a pure PR-box state.

Thus, in the following, we will only consider composite state spaces AB that are themselves transitive. As a first observation, it turns out that the maximally mixed state on AB is the product of the maximally mixed states of A and B. The proof is given in [49].

Lemma 19. If A, B, and AB are transitive dynamical state spaces, then $\mu^{AB} = \mu^A \otimes \mu^B$.

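If AB is transitive, we can decompose it into the Bloch subspace and multiples of the maximally mixed state: $AB = \widehat{(AB)} \oplus \mathbb{R}\mu^{AB}$. On the other hand, if A and B are transitive, we can substitute their local decompositions into the tensor product:

$$AB = A \otimes B = (\hat A \oplus \mathbb{R}\mu^A) \otimes (\hat B \oplus \mathbb{R}\mu^B) = (\hat A \otimes \hat B) \oplus (\hat A \otimes \mu^B) \oplus (\mu^A \otimes \hat B) \oplus \mathbb{R}\mu^{AB}.$$
Since $u^{AB} = u^A \otimes u^B$, the unit effect is zero on the first three addends in this decomposition. This shows that

$$\widehat{(AB)} = (\hat A \otimes \hat B) \oplus (\hat A \otimes \mu^B) \oplus (\mu^A \otimes \hat B). \qquad (10)$$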

This decomposition is reminiscent of another "Bloch representation" [49, 50] which writes global states in terms of three vectors: the two local reduced states, and a correlation matrix. Now suppose that, in addition, AB is irreducible. Then there is a unique inner product on $\widehat{(AB)}$ such that all transformations $T \in \mathcal{G}_{AB}$ are orthogonal. Moreover, the three subspaces in eq. (10) are preserved by local transformations $T_A \otimes T_B$, and they are mutually orthogonal. To see this, first note that for pure states $\varphi \in \Omega_A$, we have $\int_{G_A \in \mathcal{G}_A} G_A\hat\varphi \, dG_A = \int_{G_A \in \mathcal{G}_A} G_A\varphi \, dG_A - \mu^A = 0$. Since the pure states span A, the $\hat\varphi$ span $\hat A$, and this integral is zero for all vectors $\hat\varphi \in \hat A$. Now mutual orthogonality of the subspaces, for example $\hat A \otimes \hat B \perp \hat A \otimes \mu^B$, follows from

$$\langle a \otimes b, a' \otimes \mu^B\rangle = \langle (1_A \otimes G_B)(a \otimes b), (1_A \otimes G_B)(a' \otimes \mu^B)\rangle = \langle a \otimes G_B b, a' \otimes \mu^B\rangle = \int_{G_B \in \mathcal{G}_B} \langle a \otimes G_B b, a' \otimes \mu^B\rangle \, dG_B = \langle a \otimes 0, a' \otimes \mu^B\rangle = 0$$

(the other pairs of subspaces can be treated similarly). The value of the inner product on $\hat A \otimes \mu^B$ (and similarly on $\mu^A \otimes \hat B$) can be calculated explicitly:

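Lemma 20. Let A, B, and AB be transitive dynamical state spaces, where A and AB are irreducible. Then

$$\langle x \otimes \mu^B, y \otimes \mu^B\rangle = \mathcal{P}(\varphi^A \otimes \mu^B)\,\langle x, y\rangle \quad \text{for all } x, y \in \hat A,$$
where $\varphi^A$ is any pure state on A.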

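Proof. For any pair of vectors $x, y \in \hat A$, define $(x, y) := \langle x \otimes \mu^B, y \otimes \mu^B\rangle$. Clearly, this is an inner product on $\hat A$. Moreover, it is invariant with respect to all reversible transformations on A. Explicitly, for all $T \in \mathcal{G}_A$,

$$(Tx, Ty) = \langle Tx \otimes \mu^B, Ty \otimes \mu^B\rangle = \langle (T \otimes 1)(x \otimes \mu^B), (T \otimes 1)(y \otimes \mu^B)\rangle = \langle x \otimes \mu^B, y \otimes \mu^B\rangle = (x, y),$$
since local transformations are in particular orthogonal with respect to the global invariant inner product. Since A is irreducible, the invariant inner product on $\hat A$ is unique up to a positive factor, so there exists some global constant c > 0 such that $(x, y) = c\,\langle x, y\rangle$. Choosing $x = y = \hat\varphi^A$ for some pure state $\varphi^A \in \Omega_A$ shows that $c = (\hat\varphi^A, \hat\varphi^A) = \langle \hat\varphi^A \otimes \mu^B, \hat\varphi^A \otimes \mu^B\rangle$. But

$$(\varphi^A \otimes \mu^B)^{\wedge} = (\hat\varphi^A + \mu^A) \otimes \mu^B - \mu^A \otimes \mu^B = \hat\varphi^A \otimes \mu^B, \qquad (11)$$

hence $c = \langle (\varphi^A \otimes \mu^B)^{\wedge}, (\varphi^A \otimes \mu^B)^{\wedge}\rangle = \mathcal{P}(\varphi^A \otimes \mu^B)$. □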

We would like to construct Pauli maps on AB from Pauli maps on A and B. In particular, if $X^A$ is a Pauli map on A, a natural idea is to use the map $X^A \otimes u^B$ on the global state space. Is this a Pauli map? First, we have $X^A \otimes u^B(\mu^{AB}) = X^A(\mu^A)\,u^B(\mu^B) = 0$, so the first condition of Definition 12 is satisfied. But there is a second condition, demanding that the vector $(X^A \otimes u^B)^{\wedge}$ representing this map must be normalized. As it turns out, this vector has norm larger than one in general. The following lemma says how the map has to be normalized in order to obtain a Pauli map on AB.

Lemma 21. Let A, B, and AB be transitive dynamical state spaces, where A and AB are irreducible. If $X^A : A \to \mathbb{R}$ is a Pauli map on A, then the following identity holds and describes a Pauli map on AB:

$$\frac{X^A \otimes u^B}{\|(X^A \otimes u^B)^{\wedge}\|_2} = \sqrt{\mathcal{P}(\varphi^A \otimes \mu^B)}\; X^A \otimes u^B,$$

where $\varphi^A$ is an arbitrary pure state on A, and $\mu^B$ is the maximally mixed state on B.

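Proof. Let $c := \|(X^A \otimes u^B)^{\wedge}\|_2$ and $X := \frac{1}{c}\,X^A \otimes u^B$, then clearly $\|\hat X\|_2 = 1$ and $X(\mu^{AB}) = \frac{1}{c}\,X^A(\mu^A)\,u^B(\mu^B) = 0$, so X is a Pauli map according to Definition 12. It remains to show that $\frac{1}{c^2} = \mathcal{P}(\varphi^A \otimes \mu^B)$. Recall the decomposition of $\widehat{(AB)}$ from eq. (10). The functional $X^A \otimes u^B$ acts as the zero map on $\hat A \otimes \hat B$ and $\mu^A \otimes \hat B$, hence it achieves its maximal value on unit vectors on the subspace $\hat A \otimes \mu^B$. Thus, by elementary analysis,

$$c = \|(X^A \otimes u^B)^{\wedge}\|_2 = \max_{v \in \widehat{(AB)} \setminus \{0\}} \frac{|X^A \otimes u^B(v)|}{\|v\|_2} = \max_{v \in \hat A \otimes \mu^B \setminus \{0\}} \frac{|X^A \otimes u^B(v)|}{\|v\|_2} = \max_{a \in \hat A \setminus \{0\}} \frac{|X^A(a)|}{\|a \otimes \mu^B\|_2}.$$

According to Lemma 20, we have $\|a \otimes \mu^B\|_2 = \sqrt{\langle a \otimes \mu^B, a \otimes \mu^B\rangle} = \sqrt{\mathcal{P}(\varphi^A \otimes \mu^B)}\,\|a\|_2$, hence

$$c = \frac{1}{\sqrt{\mathcal{P}(\varphi^A \otimes \mu^B)}} \max_{a \in \hat A \setminus \{0\}} \frac{|X^A(a)|}{\|a\|_2} = \frac{\|\hat X^A\|_2}{\sqrt{\mathcal{P}(\varphi^A \otimes \mu^B)}} = \frac{1}{\sqrt{\mathcal{P}(\varphi^A \otimes \mu^B)}}.$$

This proves the claim. □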

Suppose we draw a pure state $\omega^{AB} \in \Omega_{AB}$ at random. This can be alternatively understood as a two-part process: first, we fix an arbitrary pure state $\alpha^{AB} \in \Omega_{AB}$. Then, we apply a random reversible transformation $T \in \mathcal{G}_{AB}$ to it (drawn according to the Haar measure): the result $\omega^{AB} = T\alpha^{AB}$ will be a random pure state. Similarly, we can fix a mixed state $\alpha^{AB} \in \Omega_{AB}$ with $\mathcal{P}(\alpha^{AB}) = \mathcal{P}_0 < 1$, and apply a Haar-random reversible transformation to it: $\omega^{AB} = T\alpha^{AB}$.

Having an initially mixed global state describes, for example, classical coin tossing, with A the coin and B the environment. We will loosely describe this situation as "drawing a random state $\omega^{AB}$ of purity $\mathcal{P}_0 := \mathcal{P}(\alpha^{AB}) = \mathcal{P}(\omega^{AB})$", but this description is not quite correct: there is no natural invariant measure on the set of all states with fixed purity $\mathcal{P}_0 < 1$, since those states are in general not all connected by reversible transformations (an obvious example is given by the square state space in Figure 2). Thus, not all properties of the random state $\omega^{AB}$ will be independent of the initial state $\alpha^{AB}$. However, as we shall see, the expected local purity will be independent of $\alpha^{AB}$, and this is all we are interested in here.

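Theorem 22. Let A, B, and AB be transitive dynamical state spaces, where A and AB are irreducible. Draw a state $\omega^{AB} \in \Omega_{AB}$ of fixed purity $\mathcal{P}(\omega^{AB})$ randomly. Then, the expected purity of the local reduced state $\omega^A$ is

$$\mathbb{E}_\omega \mathcal{P}(\omega^A) = \frac{K_A - 1}{K_A K_B - 1} \cdot \frac{\mathcal{P}(\omega^{AB})}{\mathcal{P}(\varphi^A \otimes \mu^B)},$$
where $\varphi^A$ is an arbitrary pure state on A, and $\mu^B$ is the maximally mixed state on B.

Proof. Let $X^A$ be any Pauli map on A, then $X := \sqrt{\mathcal{P}(\varphi^A \otimes \mu^B)}\; X^A \otimes u^B$ is a Pauli map on AB according to Lemma 21. Using the invariance of the Haar measure and Lemma 13, we calculate

$$\frac{\mathbb{E}_\omega \mathcal{P}(\omega^A)}{K_A - 1} = \mathbb{E}_\omega \int_{G \in \mathcal{G}_A} \big(X^A \circ G(\omega^A)\big)^2 dG = \int_{G \in \mathcal{G}_A} \mathbb{E}_\omega \big(X^A \otimes u^B(G \otimes 1(\omega^{AB}))\big)^2 dG$$
$$= \mathbb{E}_\omega \big(X^A \otimes u^B(\omega^{AB})\big)^2 = \int_{G \in \mathcal{G}_{AB}} \mathbb{E}_\omega \big(X^A \otimes u^B(G\omega^{AB})\big)^2 dG$$
$$= \frac{1}{\mathcal{P}(\varphi^A \otimes \mu^B)}\, \mathbb{E}_\omega \int_{G \in \mathcal{G}_{AB}} \big(X \circ G(\omega^{AB})\big)^2 dG = \frac{1}{\mathcal{P}(\varphi^A \otimes \mu^B)} \cdot \frac{\mathcal{P}(\omega^{AB})}{K_A K_B - 1}.$$

The symbol $\mathbb{E}_\omega$ denoting the expected value with respect to $\omega$ disappears in the last step since $\omega^{AB}$ is drawn from a uniform distribution on a set of states with fixed purity (as described above). □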

It is easily checked that this result contains the well-known quantum result that random pure bipartite states are almost maximally entangled with high probability if $|B| \gg |A|$, but we will not demonstrate this here. Instead, we ask whether the expression $\mathcal{P}(\varphi^A \otimes \mu^B)$ appearing in Theorem 22 can be simplified. It turns out that this is possible under some additional assumptions, and that this expression is related to the information carrying capacity of the involved state spaces. This will be shown in the next subsection.
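The quantum statement mentioned above can be probed with a small Monte Carlo experiment. The sketch below is an illustration under stated assumptions, not the authors' method: Haar-random pure states are sampled as normalized complex Gaussian vectors, and the estimate is compared with the standard closed form $\mathbb{E}\,\mathrm{Tr}(\rho_A^2) = (d_A + d_B)/(d_A d_B + 1)$ from the quantum literature, which is quoted here rather than derived in this section. Sample size and seed are arbitrary.

```python
import random

def random_pure_state(d_a, d_b, rng):
    """Haar-random pure state on C^{d_a} (x) C^{d_b}, stored as a d_a x d_b
    amplitude matrix (normalized complex Gaussian vector)."""
    c = [[complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(d_b)]
         for _ in range(d_a)]
    norm = sum(abs(z) ** 2 for row in c for z in row) ** 0.5
    return [[z / norm for z in row] for row in c]

def local_trace_purity(c):
    """Tr(rho_A^2) for the reduced state rho_A = C C^dagger."""
    d_a, d_b = len(c), len(c[0])
    rho = [[sum(c[i][k] * c[j][k].conjugate() for k in range(d_b))
            for j in range(d_a)] for i in range(d_a)]
    return sum((rho[i][j] * rho[j][i]).real
               for i in range(d_a) for j in range(d_a))

def estimate(d_a, d_b, samples=4000, seed=0):
    """Monte Carlo estimate of the expected Tr(rho_A^2)."""
    rng = random.Random(seed)
    total = sum(local_trace_purity(random_pure_state(d_a, d_b, rng))
                for _ in range(samples))
    return total / samples

def closed_form(d_a, d_b):
    """Standard quantum result, quoted from the literature."""
    return (d_a + d_b) / (d_a * d_b + 1)

mc_value = estimate(2, 3)
exact_value = closed_form(2, 3)
```

As $d_B$ grows with $d_A$ fixed, the closed form approaches $1/d_A$, the value of $\mathrm{Tr}(\rho_A^2)$ for the maximally mixed local state.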

F. Classical subsystems and capacity 

How can we quantify the ability of a system (or state space) to carry classical information? A classical bit A and a quantum bit B both carry one bit of classical information, even though the state space dimensions are quite different: $K_A = 2$, while $K_B = 4$. The relevant quantity turns out to be the maximal number of perfectly distinguishable states [28], denoted N. In order to define it, we have to talk about measurements.

Single measurement outcomes on a state space A are described by effects, which are linear maps $E : A \to \mathbb{R}$ with the property that $E(\omega) \geq 0$ for all $\omega \in A_+$. The set of all effects [63] is known as the dual cone $A_+^*$ of the cone of unnormalized states $A_+$ in convex geometry fST]. An n-outcome measurement is a collection of effects $E_1, \ldots, E_n$ that sum to the order unit: $\sum_{i=1}^n E_i = u^A$. The probability of obtaining outcome i on state $\omega \in \Omega_A$ is $E_i(\omega)$.

Definition 23. Let A be any state space.

• A set of pure states $\omega_1, \ldots, \omega_n \in \Omega_A$ is called a classical subsystem if there is a measurement $E_1, \ldots, E_n$ such that $E_i(\omega_j) = \delta_{ij}$ (which is 1 for i = j and 0 otherwise); that is, if the states are perfectly distinguishable by a single-shot measurement.

• The capacity $N_A$ is defined to be the maximal size of any classical subsystem of A.

• If A is a transitive state space, then a classical subsystem $\omega_1, \ldots, \omega_n$ will be called centered if $\frac{1}{n}\sum_{i=1}^n \omega_i = \mu^A$.

• If A is a transitive dynamical state space, then a classical subsystem $\omega_1, \ldots, \omega_n$ will be called dynamical if for every permutation $\pi$ on $\{1, \ldots, n\}$, there is a reversible transformation $T_\pi \in \mathcal{G}_A$ such that $T_\pi(\omega_i) = \omega_{\pi(i)}$ for all i.

A "classical subsystem" is a subset of a state space which, in many respects, behaves like a classical system from probability theory. For example, given orthonormal vectors $|\psi_1\rangle, \ldots, |\psi_n\rangle \in \mathbb{C}^d$ with $\langle\psi_i|\psi_j\rangle = \delta_{ij}$, the corresponding quantum states $\omega_i := |\psi_i\rangle\langle\psi_i|$ constitute a classical subsystem. It is centered if and only if n = d, and the quantum state space capacity is its Hilbert space dimension d. A classical subsystem is dynamical if it also carries all of the reversible dynamics of classical probability theory, that is, all the permutations. This is clearly the case in quantum theory, where every permutation of the orthonormal basis vectors can be implemented by a unitary transformation.

It is easy to see that to any set of mixed states $\omega_1, \ldots, \omega_n$ with effects $E_1, \ldots, E_n$ such that $E_i(\omega_j) = \delta_{ij}$, there exists a set of pure states $\omega_1', \ldots, \omega_n'$ such that $E_i(\omega_j') = \delta_{ij}$. Thus, the requirement of purity in this definition introduces no restriction. Here are some simple consequences of this definition:

Lemma 24. We have the following properties of capacity and classical subsystems:

(i) Capacity satisfies $N_A \leq K_A$, and we have equality if and only if A is a classical state space, i.e. $\Omega_A$ is a simplex.
(ii) If $\omega_1, \ldots, \omega_n$ is a centered classical subsystem, then necessarily $n = N_A$.
(iii) If A and B carry centered classical subsystems, then so does AB, and we have $N_{AB} = N_A N_B$.
(iv) If a dynamical classical subsystem contains the maximally mixed state in its affine hull, then it is centered.

Proof. (i) It follows from the definition that sets of perfectly distinguishable states are linearly independent. Since the number of linearly independent vectors is upper-bounded by the dimension $K_A$, this proves that $N_A \leq K_A$. Now suppose we have equality, then the perfectly distinguishable states $\omega_1, \ldots, \omega_n$ with $n = N_A$ are a basis of A. Every state $\omega \in \Omega_A$ can thus be written $\omega = \sum_{i=1}^n \alpha_i\omega_i$ with $\alpha_i \in \mathbb{R}$. Since $u^A(\omega) = 1 = u^A(\omega_i)$, we get $\sum_{i=1}^n \alpha_i = 1$. Moreover, we have $0 \leq E_j(\omega) = \sum_{i=1}^n \alpha_i E_j(\omega_i) = \alpha_j$. That is, $\omega$ is in the convex hull of $\omega_1, \ldots, \omega_n$; in other words, $\Omega_A$ is the simplex generated by the $\omega_i$.

(ii) Clearly, $N_A \geq n$. In order to see the converse inequality, let $\alpha_1, \ldots, \alpha_{N_A}$ be a maximal classical subsystem with corresponding effects $E_1, \ldots, E_{N_A}$. Due to transitivity, for every k, there is a reversible transformation $T_k \in \mathcal{G}_A$ such that $T_k\omega_1 = \alpha_k$. Using the invariance of the maximally mixed state, we get

$$E_k(\mu^A) = E_k(T_k\mu^A) = E_k\Big(T_k \frac{1}{n}\sum_{i=1}^n \omega_i\Big) = \frac{1}{n}\sum_{i=1}^n E_k(T_k\omega_i) \geq \frac{1}{n}E_k(T_k\omega_1) = \frac{1}{n}E_k(\alpha_k) = \frac{1}{n}.$$

On the other hand, we have $1 = \sum_{k=1}^{N_A} E_k(\mu^A) \geq \frac{N_A}{n}$. This proves that $N_A \leq n$.

(iii) If $\{\omega_i^A\}_{i=1}^{N_A}$ and $\{\omega_j^B\}_{j=1}^{N_B}$ are centered classical subsystems on A and B respectively, then all states $\omega_i^A \otimes \omega_j^B$ are pure. Moreover, they are perfectly distinguishable by the corresponding product measurement, and

$$\frac{1}{N_A N_B}\sum_{i,j} \omega_i^A \otimes \omega_j^B = \Big(\frac{1}{N_A}\sum_{i=1}^{N_A}\omega_i^A\Big) \otimes \Big(\frac{1}{N_B}\sum_{j=1}^{N_B}\omega_j^B\Big) = \mu^A \otimes \mu^B = \mu^{AB}.$$

Thus, $\{\omega_i^A \otimes \omega_j^B\}_{i,j}$ is a centered classical subsystem on AB of size $N_A N_B$, and it follows from part (ii) that $N_{AB} = N_A N_B$.

(iv) Suppose that $\omega_1, \ldots, \omega_n$ is a dynamical classical subsystem on A, and $\mu^A = \sum_{i=1}^n r_i\omega_i$ for some real numbers $r_i \in \mathbb{R}$ with $\sum_{i=1}^n r_i = 1$. Let $E_1, \ldots, E_n$ be the corresponding effects with $E_i(\omega_j) = \delta_{ij}$. Let $j, k \in \{1, \ldots, n\}$ be arbitrary, and $\pi$ a permutation with $k = \pi^{-1}(j)$. Then

$$r_j = \sum_{i=1}^n r_i\delta_{ij} = \sum_{i=1}^n r_i E_j(\omega_i) = E_j(\mu^A) = E_j(T_\pi\mu^A) = \sum_{i=1}^n r_i E_j(T_\pi\omega_i) = \sum_{i=1}^n r_i E_j(\omega_{\pi(i)}) = r_{\pi^{-1}(j)} = r_k.$$

Thus, all $r_i$ are equal, and since $\sum_{i=1}^n r_i = 1$, we must have $r_i = \frac{1}{n}$ for all i. This proves the claim. □

Not every state space carries a centered classical subsystem. This is illustrated in Figure 2: both the square state space A and the pentagon B have capacity $N_A = N_B = 2$. Any pair of antipodal pure states of the square constitutes a centered classical subsystem of A, but the pentagon does not possess any centered classical subsystem. A polygonal state space with $n \geq 4$ sides carries a centered classical subsystem if and only if n is even.
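The parity condition for polygons has a simple geometric core that can be checked directly. The sketch below is a partial illustration only: it assumes, as stated above for the square and pentagon, that the capacity is 2, so that a centered classical subsystem must consist of two vertices whose midpoint is the center. It does not verify perfect distinguishability of those vertices, and the helper names are hypothetical.

```python
import math

def polygon_vertices(n):
    """Pure states of the regular n-gon, as Bloch vectors on the unit circle."""
    return [(math.cos(2 * math.pi * k / n), math.sin(2 * math.pi * k / n))
            for k in range(n)]

def has_antipodal_pair(n, tol=1e-9):
    """True if two vertices of the regular n-gon average to the center (0, 0)."""
    v = polygon_vertices(n)
    return any(abs(a[0] + b[0]) < tol and abs(a[1] + b[1]) < tol
               for a in v for b in v)

# Check the polygons with n = 4, ..., 8 sides.
centered_pair_exists = {n: has_antipodal_pair(n) for n in range(4, 9)}
```

Antipodal vertex pairs exist exactly for even n, which is consistent with the statement for the square and the pentagon.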

Why is it natural to assume the existence of a centered classical subsystem? We will now discuss three good reasons for a centered classical subsystem to exist in physically relevant state spaces. A first motivation comes from dynamical considerations in group theory. Consider a qubit. The north and south pole (say, $\omega_1 = |0\rangle\langle 0|$ and $\omega_2 = |1\rangle\langle 1|$) constitute a classical subsystem, and it is one with rich dynamics: we can do "classical computation" in this subsystem, that is, implement all the permutations (which is just a bit flip in the case of a qubit, but involves many more transformations for higher-dimensional quantum systems).

More generally, we may ask what transformations preserve this classical subsystem. Together with the bit flips, these are the rotations around the z-axis, and there are many of them: only the maximally mixed state (and no other) is preserved by those transformations. It turns out that this property forces the classical subsystem to be centered:

Lemma 25. Let $\omega_1, \ldots, \omega_n$ be a classical subsystem on a transitive dynamical state space A, and let $\mathcal{G}_A^\omega \subseteq \mathcal{G}_A$ be its stabilizer subgroup. If the maximally mixed state $\mu^A$ is the only $\mathcal{G}_A^\omega$-invariant state, then the subsystem is centered.

Proof. Every $T \in \mathcal{G}_A^\omega$ preserves $\mu^\omega := \frac{1}{n}\sum_{i=1}^n \omega_i$. If the lemma's condition is satisfied, this must be the maximally mixed state $\mu^A$. □

This means that if the state space is symmetric enough to allow for a rich group of dynamics (leaving the classical 
subsystem invariant), and if that subsystem is "large" enough such that the corresponding group "mixes" basically 
all of state space, then the subsystem must be centered. 

As a second motivation, consider any maximal classical subsystem $\omega_1, \ldots, \omega_N$. We can think of the convex hull $\mathrm{conv}\{\omega_1, \ldots, \omega_N\}$ as a classical state space (a simplex) embedded in the more general, larger state space. This simplex carries its own "classical" maximally mixed state, which is $\mu^{\mathrm{classical}} := \frac{1}{N}\sum_{i=1}^N \omega_i$. The property of being centered just means that this classical maximally mixed state equals the maximally mixed state of the larger theory, $\mu^{\mathrm{classical}} = \mu$: classical probability theory is embedded in a "symmetric" way.

From a physics point of view, this is to be expected whenever we have some kind of "decoherence mechanism" which effectively reduces observations to the embedded classical system. On an n-level quantum system, for example, decoherence can effectively reduce the observable state space to that of an n-simplex, which corresponds to diagonal density matrices in the Hamiltonian's eigenbasis. Now suppose that decoherence has taken place, and in addition, we have total ignorance about the classical state of our system, such that we hold the state $\mu^{\mathrm{classical}}$. Physically, we expect that we are left with no remaining information at all: if we have perfect decoherence, followed by perfect classical ignorance of the state, there should be no more remaining information that we could read out by measurement. This implies that $\mu^{\mathrm{classical}} = \mu$, that is, the existence of a centered classical subsystem. This subsystem determines a "preferred basis" for decoherence.

A third, more operational way to understand this property is a principle [52] of "information saturation": suppose that Alice obtains a message $i \in \{1, \ldots, N\}$ randomly, with uniform distribution. She encodes this message into the state $\omega_i$ of the state space's maximal classical subsystem, and sends it to Bob. The principle of information saturation asserts that Alice can use this to send the message i to Bob with perfect success probability, but not more. This amounts to saying that the mixed state that she effectively sends, $\frac{1}{N}\sum_{i=1}^N \omega_i$, should be the maximally mixed state $\mu$ of the theory.

Before turning to the main result of this section, we need to consider one more property of state spaces. So far, we have mainly talked about classical subsystems on single state spaces. However, if we are interested in classical subsystems on composite state spaces AB, we expect that our theory can imitate another computational feature of classical probability theory: that dynamical classical subsystems on A and B combine to dynamical classical subsystems on AB. In other words, we expect that AB carries a dynamical classical subsystem which can be decomposed into A- and B-parts.

Definition 26 (Composite Classical Subsystem). A composite transitive dynamical state space AB is said to carry a composite classical subsystem if there are centered dynamical classical subsystems $\omega_1^A, \ldots, \omega_{N_A}^A$ on A and $\omega_1^B, \ldots, \omega_{N_B}^B$ on B such that the corresponding classical subsystem containing the states $\omega_{ij}^{AB} := \omega_i^A \otimes \omega_j^B$ is dynamical.

We know from Lemma 24 that the states $\omega_{ij}^{AB}$ are automatically a centered classical subsystem, and $N_{AB} = N_A N_B$. However, it is not automatically clear that all permutations on this classical subsystem can be implemented reversibly, that is, that this classical subsystem is dynamical. If it is, it will be called a composite classical subsystem.

Intuitively, this means that A and B contain classical probability distributions as subsystems, in the "friendliest" 
possible way: all permutations can be applied; the classical states of AB are combinations of those of A and B; the 
local "classical" maximally mixed states correspond to the maximally mixed states of A and B. The philosophy of this 
assumption is that physical state spaces should always be generalizations of classical probability theory, reducing to 
the latter in the case of decoherence. 

Centered dynamical classical subsystems have a nice symmetry property: 

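Lemma 27. Let $\{\omega_1, \ldots, \omega_N\}$ be a centered dynamical classical subsystem on some state space. Then $\langle\hat\omega_i, \hat\omega_j\rangle = -\frac{1}{N-1}$ for all $i \neq j$.

Proof. By definition, for every permutation $\pi$ on $\{1, \ldots, N\}$, there exists a reversible transformation $T_\pi$ such that $T_\pi\omega_i = \omega_{\pi(i)}$. Hence $\langle\hat\omega_i, \hat\omega_j\rangle = \langle T_\pi\hat\omega_i, T_\pi\hat\omega_j\rangle = \langle\hat\omega_{\pi(i)}, \hat\omega_{\pi(j)}\rangle$. This proves that there is some constant $\xi \in \mathbb{R}$ such that $\langle\hat\omega_i, \hat\omega_j\rangle = \xi$ for all $i \neq j$. Now use the fact that the classical subsystem is centered, so that $\sum_i \hat\omega_i = 0$, and that the $\omega_i$ are pure, so that $\langle\hat\omega_i, \hat\omega_i\rangle = 1$:

$$0 = \Big\langle \sum_{i=1}^N \hat\omega_i, \sum_{j=1}^N \hat\omega_j \Big\rangle = \sum_{i=1}^N \langle\hat\omega_i, \hat\omega_i\rangle + \sum_{i \neq j} \langle\hat\omega_i, \hat\omega_j\rangle = N + N(N-1)\,\xi.$$

This equation can be used to infer that $\xi = -1/(N - 1)$. □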
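Lemma 27 can be checked exactly on the simplest example, the classical N-level system, whose N deterministic states form a centered dynamical classical subsystem. The sketch below is illustrative: the invariant inner product on Bloch vectors is assumed to be $\frac{N}{N-1}$ times the Euclidean dot product, the normalization for which pure states have purity 1.

```python
from fractions import Fraction

def bloch_vector(i, n):
    """Bloch vector of the i-th deterministic state: e_i minus the uniform vector."""
    return [Fraction(int(j == i)) - Fraction(1, n) for j in range(n)]

def inner(x, y):
    """Assumed invariant inner product: n/(n-1) times the Euclidean dot product."""
    n = len(x)
    return Fraction(n, n - 1) * sum(a * b for a, b in zip(x, y))

N = 4
vectors = [bloch_vector(i, N) for i in range(N)]

# Diagonal entries are purities of pure states; off-diagonal entries
# should all equal -1/(N - 1).
diagonal = [inner(v, v) for v in vectors]
off_diagonal = {inner(vectors[i], vectors[j])
                for i in range(N) for j in range(N) if i != j}

# Centeredness: the Bloch vectors sum to zero in every component.
bloch_sum = [sum(v[k] for v in vectors) for k in range(N)]
```

Geometrically, the Bloch vectors of a centered dynamical classical subsystem point to the corners of a regular simplex centered at the origin.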

Now we are ready to prove the main result of this subsection. 

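Theorem 28. Let A, B, and AB be irreducible, and suppose that AB carries a composite classical subsystem. Then

$$\mathcal{P}(\varphi^A \otimes \mu^B) = \frac{N_A - 1}{N_A N_B - 1} \quad \text{for every pure state } \varphi^A \in \Omega_A.$$

Proof. By definition, there are centered dynamical classical subsystems $\omega_1^A, \ldots, \omega_{N_A}^A$ on A, and $\omega_1^B, \ldots, \omega_{N_B}^B$ on B such that the states $\omega_{ij}^{AB} := \omega_i^A \otimes \omega_j^B$ constitute a centered dynamical classical subsystem on AB. We know from Lemma 27 that $\langle\hat\omega_{ij}^{AB}, \hat\omega_{kl}^{AB}\rangle = -1/(N_A N_B - 1)$ if $(i, j) \neq (k, l)$. Decomposing $\mu^B$, we get $\omega_i^A \otimes \mu^B = \frac{1}{N_B}\sum_{j=1}^{N_B} \omega_i^A \otimes \omega_j^B$ and thus $(\omega_i^A \otimes \mu^B)^{\wedge} = \frac{1}{N_B}\sum_{j=1}^{N_B} (\omega_i^A \otimes \omega_j^B)^{\wedge}$. Consequently,

$$\mathcal{P}(\omega_i^A \otimes \mu^B) = \frac{1}{N_B^2}\sum_{j,l=1}^{N_B} \langle\hat\omega_{ij}^{AB}, \hat\omega_{il}^{AB}\rangle = \frac{1}{N_B^2}\Big(N_B - \frac{N_B(N_B - 1)}{N_A N_B - 1}\Big).$$

Some simplification completes the proof. □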
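The capacity formula for $\mathcal{P}(\varphi^A \otimes \mu^B)$ can be compared against direct computations in the two familiar theories. The sketch below is a consistency check under assumed normalizations, not part of the original proof: for quantum theory, $\mathcal{P}(\rho) = (d\,\mathrm{Tr}(\rho^2) - 1)/(d - 1)$ with $N = d$; for classical theory, $\mathcal{P}(p) = \frac{n}{n-1}\sum_i (p_i - \frac{1}{n})^2$ with $N = n$.

```python
from fractions import Fraction

def capacity_formula(n_a, n_b):
    """(N_A - 1)/(N_A N_B - 1), the closed form being checked."""
    return Fraction(n_a - 1, n_a * n_b - 1)

def quantum_pure_times_mixed(d_a, d_b):
    """Purity of |phi><phi| (x) 1/d_b, using Tr(rho^2) = 1/d_b."""
    d = d_a * d_b
    return (d * Fraction(1, d_b) - 1) / (d - 1)

def classical_pure_times_uniform(n_a, n_b):
    """Purity of (deterministic on A) (x) (uniform on B) as a probability vector."""
    n = n_a * n_b
    p = [Fraction(1, n_b)] * n_b + [Fraction(0)] * (n - n_b)
    return Fraction(n, n - 1) * sum((x - Fraction(1, n)) ** 2 for x in p)

# Compare all three expressions on a small grid of system sizes.
sizes = [(a, b) for a in range(2, 6) for b in range(2, 6)]
quantum_ok = all(quantum_pure_times_mixed(a, b) == capacity_formula(a, b)
                 for a, b in sizes)
classical_ok = all(classical_pure_times_uniform(a, b) == capacity_formula(a, b)
                   for a, b in sizes)
```

Both theories give the same value because the expression depends only on the capacities, not on the state space dimensions K.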



Substituting Theorem 28 into Theorem 22 proves

Theorem 29. Let A, B, and AB be irreducible, and suppose that AB carries a composite classical subsystem. Draw a state $\omega^{AB} \in \Omega_{AB}$ of fixed purity $\mathcal{P}(\omega^{AB})$ randomly, then

$$\mathbb{E}_\omega \mathcal{P}(\omega^A) = \frac{K_A - 1}{K_A K_B - 1} \cdot \frac{N_A N_B - 1}{N_A - 1} \cdot \mathcal{P}(\omega^{AB}).$$

This is the sought-for specialization of Theorem 22. Both Theorem 22 and Theorem 29 give explicit expressions for the expected local purity of random bipartite states. While Theorem 22 is more general (it does not assume the existence of a composite classical subsystem), it has the disadvantage of containing a term $\mathcal{P}(\varphi^A \otimes \mu^B)$ with no simple operational meaning. The statement of Theorem 29 is operationally simpler, but makes stronger assumptions on the state spaces.

Note also that a further simplification may be made in the case where $K = N^r$ for some integer r, a class of theories discussed in [27, 28]. Then r becomes the only parameter that determines the expected purity of a subsystem, and for random pure states $\mathbb{E}_\omega \mathcal{P}(\omega^A) \approx N_B^{1-r}$ (where the approximation is good if $N \gg 1$ for all systems and subsystems under consideration).
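The expected local purity in terms of K and N can be evaluated exactly for quantum and classical theory. The sketch below is a consistency check under assumptions: the function implements the product $\frac{K_A - 1}{K_A K_B - 1}\cdot\frac{N_A N_B - 1}{N_A - 1}\cdot\mathcal{P}(\omega^{AB})$, and the quantum reference value is obtained from the standard result $\mathbb{E}\,\mathrm{Tr}(\rho_A^2) = (d_A + d_B)/(d_A d_B + 1)$ together with the assumed normalization $\mathcal{P}(\rho) = (d\,\mathrm{Tr}(\rho^2) - 1)/(d - 1)$.

```python
from fractions import Fraction

def expected_local_purity(k_a, k_b, n_a, n_b, global_purity=1):
    """Expected local purity from dimensions K and capacities N."""
    return (Fraction(k_a - 1, k_a * k_b - 1)
            * Fraction(n_a * n_b - 1, n_a - 1) * global_purity)

def quantum_reference(d_a, d_b):
    """Normalized purity computed from E Tr(rho_A^2) = (d_a + d_b)/(d_a d_b + 1)."""
    tr = Fraction(d_a + d_b, d_a * d_b + 1)
    return (d_a * tr - 1) / (d_a - 1)

sizes = [(a, b) for a in range(2, 7) for b in range(2, 7)]

# Quantum theory: K = N^2 with N the Hilbert space dimension.
quantum_ok = all(expected_local_purity(a * a, b * b, a, b) == quantum_reference(a, b)
                 for a, b in sizes)

# Classical theory: K = N, so the expected local purity equals the global purity.
classical_ok = all(expected_local_purity(a, b, a, b) == 1 for a, b in sizes)

# Large-N behaviour for r = 2: the value approaches N_B^(1 - r) = 1/N_B.
ratio = expected_local_purity(100 ** 2, 200 ** 2, 100, 200) * 200
```

For quantum theory the formula reduces to $(d_A + 1)/(d_A d_B + 1)$, which tends to $1/d_B$ for large dimensions.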

G. GG'-invariant faces: entanglement in symmetric subspaces

So far, we have computed the expected amount of entanglement (that is, the purity of the local reduced state) only for the case that we draw the initial pure state from the full state space AB. In many cases, however, it is useful to consider drawing random states under constraints. As a paradigmatic ensemble, suppose we draw a random pure quantum state $|\psi\rangle$ from the symmetric or antisymmetric subspace of $\mathbb{C}^n \otimes \mathbb{C}^n$. What can we say about the expected local purity in this case?

We will see that Hilbert subspaces correspond to faces of the state space in the sense of convex geometry. This will enable us to compute the average reduced purity with geometric methods, using the invariant inner product introduced in earlier subsections. Moreover, both symmetric and antisymmetric subspace are invariant under all transformations of the form $U \otimes U$. This behaviour is a special case of the following general-probabilistic definition.

Definition 30 (GG'-invariant face). Let AB be a composite dynamical state space. A face $\mathbb{F}$ of $\Omega_{AB}$ will be called GG'-invariant if for every $G \in \mathcal{G}_A$ there is some $G' \in \mathcal{G}_B$ such that $G \otimes G'$ maps $\mathbb{F}$ into itself. The stabilizer subgroup $\{G \in \mathcal{G}_{AB} \mid G\mathbb{F} = \mathbb{F}\}$ will be called $\mathcal{G}_{\mathbb{F}}$. The face $\mathbb{F}$ will be called transitive if for every pair of extreme points (pure states) $\alpha, \omega \in \mathbb{F}$ there is some $G \in \mathcal{G}_{\mathbb{F}}$ such that $G\alpha = \omega$. If $\mathbb{F}$ is transitive, we define the $\mathbb{F}$-maximally mixed state $\mu_{\mathbb{F}}$ as

$$\mu_{\mathbb{F}} := \int_{G \in \mathcal{G}_{\mathbb{F}}} G\omega \, dG,$$

where $\omega$ is any pure state in $\mathbb{F}$. For every $\omega \in \mathbb{F}$, we set $\hat\omega := \omega - \mu_{\mathbb{F}}$, and $\hat{\mathbb{F}} := \mathrm{span}\{\hat\omega \mid \omega \in \mathbb{F}\}$. $\mathbb{F}$ is called irreducible if $\mathcal{G}_{\mathbb{F}}$ acts irreducibly on $\hat{\mathbb{F}}$.

Note that GG'-invariance is not a symmetric notion: if for every G, there is some G' such that $G \otimes G'$ stabilizes $\mathbb{F}$, then it is not necessarily the case that to every G', there is some G such that $G \otimes G'$ stabilizes $\mathbb{F}$.

Example 31. Here are some examples of transitive irreducible GG'-invariant faces:

• The symmetric subspace $F_{\rm sym}$ on $n$-level quantum systems $A$ and $B$. If $\pi$ is the projector onto the symmetric subspace, then $F_{\rm sym} = \{\rho \mid \mathrm{Tr}(\rho\pi) = 1\}$. This shows that $F_{\rm sym}$ is in fact a face of the state space on $AB$. If $G = U \cdot U^\dagger \in \mathcal{G}_A$ is some unitary transformation, then $(G \otimes G) F_{\rm sym} = F_{\rm sym}$, so it is GG'-invariant with $G' = G$.

There is a one-to-one correspondence between the symmetric subspace and the Hilbert space $\mathcal{H} := \mathbb{C}^{n(n+1)/2}$: every state in $F_{\rm sym}$ corresponds to a density matrix on $\mathcal{H}$, and every reversible transformation in $\mathcal{G}_{F_{\rm sym}}$ corresponds to a unitary on $\mathcal{H}$. We know that the unitaries act transitively on $\mathcal{H}$, and we have already shown that this action is irreducible (cf. Lemma 43 in the appendix), so $F_{\rm sym}$ is transitive and irreducible.

• The totally antisymmetric subspace in $A \otimes B$, where $A \simeq \mathbb{C}^n$ and $B \simeq \mathbb{C}^n \otimes \mathbb{C}^n$. If $G = U \cdot U^\dagger$, then this set of quantum states is invariant with respect to $G \otimes G'$, where $G' = U \otimes U \cdot U^\dagger \otimes U^\dagger$.

• The face $F$ of $AB$ with $A = B \simeq \mathbb{C}^n$ which consists only of the maximally entangled state, $F = \{|\psi_+\rangle\langle\psi_+|\}$, where $|\psi_+\rangle = \frac{1}{\sqrt{n}} \sum_{i=1}^n |i\rangle \otimes |i\rangle$. It is $U \otimes \bar U$-invariant.

• Coin tossing in environment with record. Suppose we have a classical coin (corresponding to one bit), and an environment whose state can be described by a bit string of length $n-1$. Initially, the joint system is in an uncorrelated state $\varphi^{AB} = \varphi^A \otimes \varphi^B$. Since the coin's state is known to us (say, it shows heads), $\varphi^A$ is pure; on the other hand, we may not have full knowledge about the environment, meaning that $\varphi^B$ is mixed.

In contrast to the usual coin tossing example of Subsection II C, we additionally assume that the environment always contains a perfect record of the coin's state. In other words, if the coin's state is 0 (or heads), the environment's state must be some bit string from a set $S_0$; if the coin's state is 1 (tails), it must be some bit string from a set $S_1$. Both $S_0$ and $S_1$ are subsets of $\{0,1\}^{n-1}$, have empty intersection, and we assume that they have the same cardinality.

As a consequence, the possible configurations of the joint system are restricted to be either of the form $0s_0$ or $1s_1$, where $s_0 \in S_0$ and $s_1 \in S_1$. The possible states (that is, probability distributions) have their full support on those configurations. This defines a face $F$ of the joint state space $AB$.

Since permutations can map every configuration of this kind to every other, $F$ is transitive. Moreover, it is GG'-invariant: if $G \in \mathcal{G}_A$ is a reversible transformation, there are only two possibilities. First, $G$ is the identity. Then, setting $G'$ also equal to the identity yields a map $G \otimes G'$ which preserves $F$. Second, $G$ is a bit flip. Then, let $T$ be a permutation which swaps $S_0$ and $S_1$ (leaving all other strings invariant). Then $G \otimes T$ preserves $F$.

We will study this scenario further in Example 39 below.
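As a small numerical illustration of the bipartite quantum case mentioned before Definition 30, the following sketch checks that the projectors onto the symmetric and antisymmetric subspaces of $\mathbb{C}^n \otimes \mathbb{C}^n$ have ranks $n(n+1)/2$ and $n(n-1)/2$ and are invariant under $U \otimes U$. The helper names `swap_operator` and `random_unitary` and the choice $n = 3$ are assumptions made only for this example; the unitary is simply some unitary obtained from a QR decomposition, not a carefully Haar-distributed one.

```python
import numpy as np

def swap_operator(n):
    """Swap F on C^n (x) C^n, defined by F|i>|j> = |j>|i>."""
    F = np.zeros((n * n, n * n))
    for i in range(n):
        for j in range(n):
            F[j * n + i, i * n + j] = 1.0
    return F

def random_unitary(n, rng):
    """Some unitary, taken from the QR decomposition of a complex Gaussian matrix."""
    z = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    q, _ = np.linalg.qr(z)
    return q

n = 3  # illustrative local dimension
rng = np.random.default_rng(0)
F = swap_operator(n)
pi_sym = (np.eye(n * n) + F) / 2    # projector onto the symmetric subspace
pi_anti = (np.eye(n * n) - F) / 2   # projector onto the antisymmetric subspace

U = random_unitary(n, rng)
UU = np.kron(U, U)

dim_sym = int(round(np.trace(pi_sym)))
dim_anti = int(round(np.trace(pi_anti)))
sym_invariant = np.allclose(UU @ pi_sym @ UU.conj().T, pi_sym)
anti_invariant = np.allclose(UU @ pi_anti @ UU.conj().T, pi_anti)
print(dim_sym, dim_anti, sym_invariant, anti_invariant)
```

Both invariance checks rely only on the fact that the swap operator commutes with $U \otimes U$.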

The following lemma will be useful.

Lemma 32. If $F$ is a transitive GG'-invariant face, and if $A$ is transitive, then $\mu_F^A = \mu^A$.

Proof. Let $G \in \mathcal{G}_A$ be arbitrary, and let $E^A$ be any effect on $A$, then

\[ E^A(G^{-1}\mu_F^A) = (E^A \circ G^{-1}) \otimes u^B(\mu_F) = E^A \otimes u^B\big((G^{-1} \otimes \mathbb{1})(G \otimes G'(\mu_F))\big) = E^A \otimes (u^B \circ G')(\mu_F) = E^A \otimes u^B(\mu_F) = E^A(\mu_F^A). \]

Since this is true for all $E^A$, we must have $G^{-1}\mu_F^A = \mu_F^A$. But the only state which is invariant with respect to all
reversible transformations on $A$ is $\mu^A$, hence $\mu_F^A = \mu^A$. □

Another technical ingredient is this:

Lemma 33. Let $AB$ be a transitive dynamical state space and $F$ a transitive irreducible GG'-invariant face. Then $\bar F \perp \hat\mu_F$.

Proof. Suppose that $\alpha \in \bar F$ and $G \in \mathcal{G}_F$, then

\[ \langle \alpha, \hat\mu_F \rangle = \langle G\alpha, G\hat\mu_F \rangle = \langle G\alpha, \hat\mu_F \rangle = \Big\langle \int_{G \in \mathcal{G}_F} G\alpha \, dG, \hat\mu_F \Big\rangle = \langle 0, \hat\mu_F \rangle = 0. \]

This proves the claim. □

Theorem 34. Let $F$ be a transitive and irreducible GG'-invariant face on an irreducible dynamical state space $AB$, where $A$ is
also transitive and irreducible. Drawing a state $\omega^{AB} \in F$ with fixed purity $\mathcal{P}(\omega^{AB})$ randomly, the expected local purity is

\[ \mathbb{E}_\omega \mathcal{P}(\omega^A) = \big\| \pi_F\big(\widehat{X^A \otimes u^B}\big) \big\|_2^2 \cdot \frac{K_A - 1}{K_F - 1} \left( \mathcal{P}(\omega^{AB}) - \mathcal{P}(\mu_F) \right), \]

where $K_F$ denotes the dimension of $F$, $X^A$ is any Pauli map on $A$, and $\pi_F$ denotes the orthogonal projection onto the span of $\bar F$
(using the invariant inner product on $(AB)^\wedge$).

Taking Lemma 21 into account, it is clear that this theorem reduces to Theorem 22 in the case of $F = \Omega_{AB}$.



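Proof. Abbreviate $\omega := \omega^{AB}$. Similarly as in Definition 12, call a linear map $X : AB \to \mathbb{R}$ a Pauli map on $F$ if $X(\mu_F) = 0$
and $\langle \bar X, \bar X \rangle = 1$, where $\bar X \in \mathrm{span}\,\bar F$ is the vector with $\langle \bar X, \bar\omega \rangle = X(\omega)$ for all $\omega \in F$. If $X$ is a Pauli map on $F$, the same
calculation as in the proof of Lemma 13 shows that

\[ \int_{G \in \mathcal{G}_F} \big( X \circ G(\omega) \big)^2 \, dG = \frac{\langle \bar\omega, \bar\omega \rangle}{K_F - 1} \qquad \text{for all } \omega \in F. \]

Due to Lemma 33, we also have $\langle \hat\omega, \hat\omega \rangle = \langle \bar\omega + \hat\mu_F, \bar\omega + \hat\mu_F \rangle = \langle \bar\omega, \bar\omega \rangle + \langle \hat\mu_F, \hat\mu_F \rangle$, hence $\langle \bar\omega, \bar\omega \rangle = \mathcal{P}(\omega) - \mathcal{P}(\mu_F)$.

According to Lemma 32, we have $X^A \otimes u^B(\mu_F) = X^A(\mu_F^A) = X^A(\mu^A) = 0$, hence $\frac{1}{c} X^A \otimes u^B$ is a Pauli map on $F$,
where $c := \big\| \pi_F\big(\widehat{X^A \otimes u^B}\big) \big\|_2$. Similarly as in the proof of Theorem 22, we have for all $\omega \in F$

\[ \frac{\mathbb{E}_\omega \mathcal{P}(\omega^A)}{K_A - 1} = \mathbb{E}_\omega \int_{G \in \mathcal{G}_A} \big( X^A \circ G(\omega^A) \big)^2 dG = \int_{G \in \mathcal{G}_A} \mathbb{E}_\omega \big( X^A \otimes u^B (G \otimes G'(\omega)) \big)^2 dG \]
\[ = \mathbb{E}_\omega \big( X^A \otimes u^B(\omega) \big)^2 = \int_{G \in \mathcal{G}_F} \big( X^A \otimes u^B(G\omega) \big)^2 dG = c^2 \, \frac{\langle \bar\omega, \bar\omega \rangle}{K_F - 1}. \]

Combining all the little results proves the claim. □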

In the quantum case, we can give an explicit description of the projector $\pi_F$.

Lemma 35. Suppose that $A$ is a quantum state space, and $\pi$ is a projector onto some subspace. This subspace defines a face $F$
of $\Omega_A$ by $F = \{\rho \mid \mathrm{Tr}(\pi\rho) = 1\} = \{\rho \mid \pi\rho\pi = \rho\}$. Then $\pi_F(M) = \pi M \pi - \pi\,\mathrm{Tr}(\pi M \pi)/(\mathrm{Tr}\,\pi)$.

Proof. Define $Q(M) := \pi M \pi - \pi\,\mathrm{Tr}(\pi M \pi)/(\mathrm{Tr}\,\pi)$ for all $M \in \hat A$, i.e. for all $M$ with $M = M^\dagger$ and $\mathrm{Tr}\,M = 0$. Clearly,
$Q(M)^\dagger = Q(M)$ and $\mathrm{Tr}\,Q(M) = 0$, hence we have a map $Q : \hat A \to \hat A$. Furthermore,

\[ Q(Q(M)) = \pi Q(M) \pi - \frac{\pi}{\mathrm{Tr}\,\pi} \mathrm{Tr}(\pi Q(M) \pi) = Q(M) - \frac{\pi}{\mathrm{Tr}\,\pi} \mathrm{Tr}(Q(M)) = Q(M), \]

hence $Q$ is a projector. Denote the Hilbert space dimension by $d$, then we get for the inner product on $\hat A$

\[ \frac{d-1}{d} \langle M, Q(N) \rangle = \mathrm{Tr}(M Q(N)) = \mathrm{Tr}\left[ M \left( \pi N \pi - \frac{\pi}{\mathrm{Tr}\,\pi} \mathrm{Tr}(\pi N \pi) \right) \right] = \mathrm{Tr}(M \pi N \pi) - \frac{\mathrm{Tr}(M\pi)\,\mathrm{Tr}(N\pi)}{\mathrm{Tr}\,\pi}, \]

and this expression is symmetric with respect to interchanging $M$ and $N$ (for the first addend due to the cyclicity of
the trace). Thus $Q$ is an orthogonal projector on $\hat A$.

The maximally mixed state $\mu_F$ on the face is $\mu_F = \pi/(\mathrm{Tr}\,\pi)$. Suppose that $M \in \bar F$, i.e. there is some $\rho \in F$ such
that $M = \bar\rho = \rho - \pi/(\mathrm{Tr}\,\pi)$. Then direct calculation shows that $Q(M) = M$, i.e. $\bar F \subseteq \mathrm{ran}\,Q$, and thus $\mathrm{span}\,\bar F \subseteq \mathrm{ran}\,Q$.
Now let $m := \mathrm{Tr}\,\pi$ (the dimension of the subspace), then the term $\pi M \pi$ in the definition of $Q$ creates an $m \times m$
block matrix, and the subsequent term $\pi\,\mathrm{Tr}(\pi M \pi)/(\mathrm{Tr}\,\pi)$ removes the trace of this block matrix, leaving $m^2 - 1$
parameters. Thus, $\dim(\mathrm{ran}\,Q) \leq m^2 - 1$. On the other hand, density matrices in $F$ are described by $m^2 - 1$ parameters,
so $\dim(\mathrm{span}\,\bar F) = m^2 - 1$. This proves that $\dim(\mathrm{ran}\,Q) \leq \dim(\mathrm{span}\,\bar F)$. Altogether, this proves that $\mathrm{span}\,\bar F = \mathrm{ran}\,Q$, so
that $Q$ is the orthogonal projector onto the span of $\bar F$ as claimed. □
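The properties used in the proof of Lemma 35 can be checked numerically. The following is a minimal sketch, assuming an arbitrarily chosen Hilbert space dimension `d = 5` and subspace dimension `m = 3`; the function and variable names are hypothetical. It tests that $Q$ is idempotent, symmetric with respect to the trace inner product, trace-preserving on traceless matrices, and that it fixes $\rho - \pi/\mathrm{Tr}\,\pi$ for a state $\rho$ supported on the subspace.

```python
import numpy as np

rng = np.random.default_rng(1)
d, m = 5, 3  # Hilbert space dimension and subspace dimension (illustrative values)

# Projector pi onto a random m-dimensional subspace of C^d.
z = rng.normal(size=(d, m)) + 1j * rng.normal(size=(d, m))
basis, _ = np.linalg.qr(z)              # d x m matrix with orthonormal columns
pi = basis @ basis.conj().T

def Q(M):
    """Map from Lemma 35: pi M pi - pi Tr(pi M pi) / Tr(pi)."""
    block = pi @ M @ pi
    return block - pi * np.trace(block) / np.trace(pi)

def random_traceless_hermitian(dim):
    a = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    h = (a + a.conj().T) / 2
    return h - np.eye(dim) * np.trace(h) / dim

M = random_traceless_hermitian(d)
N = random_traceless_hermitian(d)

idempotent = np.allclose(Q(Q(M)), Q(M))
symmetric = np.isclose(np.trace(M @ Q(N)), np.trace(Q(M) @ N))
stays_traceless = np.isclose(np.trace(Q(M)), 0)

# A pure state supported on the subspace, minus the face's maximally mixed state,
# should be left unchanged by Q.
psi = basis @ (rng.normal(size=m) + 1j * rng.normal(size=m))
psi /= np.linalg.norm(psi)
rho_bar = np.outer(psi, psi.conj()) - pi / m
fixed = np.allclose(Q(rho_bar), rho_bar)
print(idempotent, symmetric, stays_traceless, fixed)
```

The check does not replace the dimension-counting step of the proof; it only confirms the algebraic identities for one random instance.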

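Theorem 36. Let $S$ be a subspace of dimension $N_S$ on a bipartite quantum state space $AB$ with Hilbert space dimensions $N_A$
and $N_B$, with the property that for every unitary $U$ on $A$ there is a unitary $U'$ on $B$ such that $(U \otimes U') S = S$. Drawing a state
$\rho^{AB}$ on $S$ with fixed purity $\mathrm{Tr}\big[(\rho^{AB})^2\big]$ randomly, the expected local quantum purity is

\[ \mathbb{E}_\rho \mathrm{Tr}\big[(\rho^A)^2\big] = \frac{1}{N_A} + \frac{N_A^2 - 1}{N_S^2 - 1} \cdot \mathrm{Tr}\big[ \big( \pi (E_A \otimes \mathbb{1}_B) \big)^2 \big] \cdot \left( \mathrm{Tr}\big[(\rho^{AB})^2\big] - \frac{1}{N_S} \right), \]

where $\pi$ is the orthogonal projector onto $S$, and $E_A = E_A^\dagger$ is any matrix on $A$ with $\mathrm{Tr}\,E_A = 0$ and $\mathrm{Tr}\,E_A^2 = 1$.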

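Proof. The set of states on $AB$ that have full support on $S$ is a face $F$ of the quantum state space $\Omega_{AB}$. Since $E_A$ is
traceless, we have $\int_U U^\dagger E_A U \, dU = 0$. Hence, if $\pi$ is the orthogonal projector onto $S$, we have

\[ \mathrm{Tr}(\pi (E_A \otimes \mathbb{1}_B) \pi) = \mathrm{Tr}(\pi (E_A \otimes \mathbb{1}_B)) = \mathrm{Tr}\big( (U \otimes U') \pi (U^\dagger \otimes U'^\dagger)(E_A \otimes \mathbb{1}_B) \big) = \mathrm{Tr}\big( \pi (U^\dagger \otimes U'^\dagger)(E_A \otimes \mathbb{1}_B)(U \otimes U') \big) \]
\[ = \mathrm{Tr}\big( \pi ((U^\dagger E_A U) \otimes \mathbb{1}_B) \big) = \int_U \mathrm{Tr}\big( \pi ((U^\dagger E_A U) \otimes \mathbb{1}_B) \big) dU = \mathrm{Tr}\Big( \pi \Big( \Big[ \int_U U^\dagger E_A U \, dU \Big] \otimes \mathbb{1}_B \Big) \Big) = 0. \]

It is easy to check that $X^A(\rho) := \sqrt{\frac{N_A}{N_A - 1}}\, \mathrm{Tr}(E_A \rho)$ is a Pauli map on $A$. Thus, $X^A \otimes u^B(\rho) = \sqrt{\frac{N_A}{N_A - 1}}\, \mathrm{Tr}((E_A \otimes \mathbb{1}_B)\rho)$,
and so $\widehat{X^A \otimes u^B} = \xi_{AB}\, E_A \otimes \mathbb{1}_B$, where $\xi_{AB} := \frac{N_A N_B - 1}{N_A N_B} \sqrt{\frac{N_A}{N_A - 1}}$. Using Lemma 35, this proves that

\[ \pi_F\big(\widehat{X^A \otimes u^B}\big) = \xi_{AB}\, \pi_F(E_A \otimes \mathbb{1}_B) = \xi_{AB} \left[ \pi (E_A \otimes \mathbb{1}_B) \pi - \frac{\pi\, \mathrm{Tr}(\pi (E_A \otimes \mathbb{1}_B) \pi)}{\mathrm{Tr}\,\pi} \right] = \xi_{AB}\, \pi (E_A \otimes \mathbb{1}_B) \pi, \]

such that

\[ \big\| \pi_F\big(\widehat{X^A \otimes u^B}\big) \big\|_2^2 = \xi_{AB}^2 \, \| \pi (E_A \otimes \mathbb{1}_B) \pi \|_2^2 = \xi_{AB}^2 \, \frac{N_A N_B}{N_A N_B - 1} \mathrm{Tr}\big[ \big( \pi (E_A \otimes \mathbb{1}_B) \big)^2 \big]. \]

In order to apply Theorem 34, note that $K_A = N_A^2$ and $K_F = N_S^2$, and the maximally mixed state on $F$ is $\mu_F = \pi/(\mathrm{Tr}\,\pi)$,
such that

\[ \mathcal{P}(\mu_F) = \frac{N_A N_B}{N_A N_B - 1} \mathrm{Tr}(\mu_F^2) - \frac{1}{N_A N_B - 1} = \frac{1}{N_A N_B - 1} \left( \frac{N_A N_B}{N_S} - 1 \right). \]

Expressing all the purities $\mathcal{P}(\sigma)$ in terms of $\mathrm{Tr}(\sigma^2)$ via the quantum relation $\mathcal{P}(\sigma) = \frac{N}{N-1} \mathrm{Tr}(\sigma^2) - \frac{1}{N-1}$ and some algebraic simplification proves the claim. □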
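To illustrate Theorem 36, the following sketch evaluates the formula $\frac{1}{N_A} + \frac{N_A^2-1}{N_S^2-1}\,\mathrm{Tr}[(\pi(E_A\otimes\mathbb{1}_B))^2]\,(\mathrm{Tr}[(\rho^{AB})^2] - \frac{1}{N_S})$ for pure states on the symmetric subspace of two qubits and compares it with a Monte Carlo average over random pure states in that subspace. All function names, the sample size and the seed are assumptions made for this example, and the sampling only approximates the expectation value.

```python
import numpy as np

def swap_operator(n):
    """Swap F on C^n (x) C^n, defined by F|i>|j> = |j>|i>."""
    F = np.zeros((n * n, n * n))
    for i in range(n):
        for j in range(n):
            F[j * n + i, i * n + j] = 1.0
    return F

def predicted_purity(pi, n_a, E_a, global_purity=1.0):
    """Formula of Theorem 36 (as stated in the lead-in) for the projector pi."""
    n_b = pi.shape[0] // n_a
    n_s = int(round(np.trace(pi).real))
    X = pi @ np.kron(E_a, np.eye(n_b))
    t = np.trace(X @ X).real
    return 1 / n_a + (n_a**2 - 1) / (n_s**2 - 1) * t * (global_purity - 1 / n_s)

def sampled_purity(pi, n_a, samples, rng):
    """Monte Carlo average of Tr[(rho^A)^2] over random pure states in ran(pi)."""
    n_b = pi.shape[0] // n_a
    vals, vecs = np.linalg.eigh(pi)
    basis = vecs[:, vals > 0.5]                    # orthonormal basis of S
    n_s = basis.shape[1]
    coeffs = rng.normal(size=(samples, n_s)) + 1j * rng.normal(size=(samples, n_s))
    coeffs /= np.linalg.norm(coeffs, axis=1, keepdims=True)
    psi = (coeffs @ basis.T).reshape(samples, n_a, n_b)   # psi[s, i, j]
    rho_a = np.einsum('sij,skj->sik', psi, psi.conj())     # partial trace over B
    return np.mean(np.einsum('sik,ski->s', rho_a, rho_a).real)

n = 2
rng = np.random.default_rng(2)
pi_sym = (np.eye(n * n) + swap_operator(n)) / 2
E_a = np.diag([1.0, -1.0] + [0.0] * (n - 2)) / np.sqrt(2)   # traceless, Tr E^2 = 1

formula = predicted_purity(pi_sym, n, E_a)
estimate = sampled_purity(pi_sym, n, 20000, rng)
print(formula, estimate)
```

For $\pi = \mathbb{1}$ (no constraint) the same function reduces to $(N_A + N_B)/(N_A N_B + 1)$, the familiar average purity of a subsystem of a random pure state.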
In Subsection II D, we apply this result to compute the average entanglement in symmetric and
antisymmetric quantum subspaces. For the remainder of this subsection, we discuss the case of classical probability
theory. In this case, we can explicitly compute the norm of the projector appearing in Theorem 34.



Lemma 37. Suppose that $A$ and $B$ are classical state spaces over $N_A$ and $N_B$ outcomes, and $F$ is any GG'-invariant face on
$AB$, corresponding to $N_F$ outcomes. Then

\[ \big\| \pi_F\big(\widehat{X^A \otimes u^B}\big) \big\|_2^2 = \frac{N_F (N_A N_B - 1)}{N_A N_B (N_A - 1)} \qquad \text{and} \qquad \mathcal{P}(\mu_F) = \frac{N_A N_B / N_F - 1}{N_A N_B - 1}. \]

Proof. We use Theorem 34. First, the maximally mixed state $\mu_F$ is just the uniform distribution on the classical
outcomes that generate $F$, that is, a probability vector with $N_F$ entries equal to $1/N_F$ and all others zero. Recalling
the formula for purity in the classical case, eq. (7), gives $\mathcal{P}(\mu_F) = \frac{N_A N_B / N_F - 1}{N_A N_B - 1}$. Now we apply Theorem 34 to the
special case where the initial state is pure: $\mathcal{P}(\omega^{AB}) = 1$. Since there are no entangled states in classical probability
theory, we know that $\omega^A$ must be pure as well, i.e. $\mathcal{P}(\omega^A) = 1$, and so is its expectation value. Using that $K = N$
classically, substituting all these identities into the statement of Theorem 34 yields the norm of the projector. □

Substituting this result back into Theorem 34, we get a very simple statement regarding GG'-invariant faces in
classical probability theory. The proof involves only simple algebra and is thus omitted.

Theorem 38. Suppose that $A$ and $B$ are classical state spaces, and $F$ is any GG'-invariant face on $AB$. If we draw a random
state $\omega^{AB}$ in $F$ of fixed purity $\mathcal{P}(\omega^{AB})$, then the expected purity of the local marginal is

\[ \mathbb{E}_\omega \mathcal{P}(\omega^A) = \mathcal{P}_F(\omega^{AB}), \]

where the right-hand side denotes the purity of $\omega^{AB}$, computed by treating $\omega^{AB}$ as a state on the smaller state space $F$.
Explicitly, if $\{\omega_i^{AB}\}_{i=1}^{N_F}$ denote the entries of the probability vector, then

\[ \mathcal{P}_F(\omega^{AB}) = \frac{N_F}{N_F - 1} \sum_{i=1}^{N_F} \big( \omega_i^{AB} \big)^2 - \frac{1}{N_F - 1} \]

(compare this with eq. (7)). The result of Theorem 38 is no surprise at all: we get the same result in the unconstrained
case, Theorem 29, where the prefactors are cancelled due to $N = K$.
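The omitted algebra can be spot-checked with exact rational arithmetic. The sketch below inserts the two values of Lemma 37 into the expression $\|\pi_F(\cdot)\|_2^2 \cdot \frac{K_A-1}{K_F-1}(\mathcal{P}(\omega^{AB}) - \mathcal{P}(\mu_F))$ of Theorem 34 with $K = N$ and compares the result with the purity of the same probability vector regarded as a state on $F$. It assumes the classical purity formula $\mathcal{P}(p) = (N \sum_i p_i^2 - 1)/(N-1)$, which is consistent with the value of $\mathcal{P}(\mu_F)$ quoted in Lemma 37; the function names and the example sizes are hypothetical.

```python
from fractions import Fraction
import random

def classical_purity(p, n):
    """Assumed classical purity on n outcomes: (n * sum p_i^2 - 1) / (n - 1)."""
    return (n * sum(x * x for x in p) - 1) / Fraction(n - 1)

def theorem34_classical(p_face, n_a, n_b, n_f):
    """Theorem 34 with the classical values of Lemma 37 inserted (K = N)."""
    n = n_a * n_b
    norm_sq = Fraction(n_f * (n - 1), n * (n_a - 1))        # Lemma 37, projector norm
    purity_mu_f = (Fraction(n, n_f) - 1) / (n - 1)          # Lemma 37, P(mu_F)
    padded = list(p_face) + [Fraction(0)] * (n - n_f)       # embed the face into AB
    global_purity = classical_purity(padded, n)
    return norm_sq * Fraction(n_a - 1, n_f - 1) * (global_purity - purity_mu_f)

random.seed(0)
n_a, n_b, n_f = 2, 4, 6                                     # illustrative sizes
weights = [random.randint(1, 9) for _ in range(n_f)]
p_face = [Fraction(w, sum(weights)) for w in weights]

lhs = theorem34_classical(p_face, n_a, n_b, n_f)
rhs = classical_purity(p_face, n_f)                          # purity on the face F
print(lhs == rhs)
```

Because `Fraction` arithmetic is exact, agreement here is an identity check for the chosen vector rather than a floating-point coincidence.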



Example 39 (Coin tossing in environment with record, part 2). Recall the scenario from the last paragraph of Example 31.
Does the record in the environment affect the randomization of the coin? Suppose the coin is initially in the pure state 0 (or
heads). Then the environment's initial state $\varphi^B$ must have full support on $S_0$; for simplicity, we assume that it is otherwise
completely unknown, i.e. the uniform mixture over $S_0$. Applying Theorem 38, a little calculation shows that

\[ \mathbb{E}_\omega \mathcal{P}(\omega^A) = \frac{1}{2\,\#S_0 - 1}. \]

This is exactly the same result as Theorem 29 gives us for an unconstrained environment $B$ with $N_B = \#S_0$. This is an
environment which has half as many possible states as in the first scenario, where the possible environment configurations are in
$S_0 \cup S_1$ with cardinality $2\,\#S_0$. Intuitively, the informed environment loses one bit of randomization power due to redundancy.
The same conclusion holds for correlated initial states.
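For small $\#S_0$ the value $1/(2\,\#S_0 - 1)$ can be confirmed by brute force. The sketch below assumes that the relevant reversible transformations act on the face $F$ as all permutations of its $2\,\#S_0$ configurations, so that the initial state (uniform over the configurations with coin value 0) is mapped to the uniform state on an arbitrary subset of the same size, each subset being equally likely. It also assumes the classical one-bit purity $2(p_0^2 + p_1^2) - 1$. The function name is hypothetical.

```python
from fractions import Fraction
from itertools import combinations

def expected_coin_purity(m):
    """Exact average local purity of the coin, where m = #S_0 = #S_1.

    The face F has 2*m configurations: m of the form (0, s0) and m of the form (1, s1).
    A permutation of F maps the initial state to the uniform state on some m-element
    subset of configurations; all subsets are weighted equally.
    """
    configs = [(0, k) for k in range(m)] + [(1, k) for k in range(m)]
    total, count = Fraction(0), 0
    for support in combinations(configs, m):
        p_heads = Fraction(sum(1 for coin, _ in support if coin == 0), m)
        # Purity of the coin's marginal distribution (p_heads, 1 - p_heads).
        total += 2 * (p_heads**2 + (1 - p_heads)**2) - 1
        count += 1
    return total / count

for m in (2, 3, 4):
    print(m, expected_coin_purity(m), Fraction(1, 2 * m - 1))
```

The enumeration grows as $\binom{2m}{m}$, so it is only meant for very small environments.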

H. Theories which are not locally tomographic 

In the previous sections, we have considered certain types of composite state spaces: transitive locally tomographic
compositions $AB$ of state spaces $A$ and $B$. At present, there are no known examples of such theories beyond
quantum theory and subspaces within it such as classical probability theory. The search for such theories has only
recently started, but preliminary results suggest that theories of this kind might be rare [53].

On the other hand, it is known that there is a multitude of transitive composite state spaces $AB$ if the requirement
of local tomography is dropped [54]. As it turns out, some of our results are easily generalized to theories without
local tomography. We will sketch this in this subsection, but leave a more detailed analysis of such theories to future
work. We start with a trial definition of arbitrary compositions of state spaces which need not be locally tomographic.

Definition 40. If $A$ and $B$ are state spaces, a composition $AB$ is any state space which can be decomposed as $AB = (A \otimes B) \oplus C$ such that the following properties hold:

• If $\omega^A \in \Omega_A$ and $\omega^B \in \Omega_B$, then $\omega^A \otimes \omega^B \in \Omega_{AB}$.

• For $\omega^{AB} \in \Omega_{AB}$, define the vector $\omega^A$ via $L(\omega^A) := L \otimes u^B(\omega^{AB})$ for all linear maps $L : A \to \mathbb{R}$. (An analogous
definition yields $\omega^B$.) Then $\omega^A \in \Omega_A$ and $\omega^B \in \Omega_B$.

Moreover, if $A$ and $B$ are dynamical state spaces, a dynamical composition $AB$ is assumed to have the following property: if
$T_A \in \mathcal{G}_A$ and $T_B \in \mathcal{G}_B$, then $(T_A \otimes T_B) \oplus \mathbb{1}_C \in \mathcal{G}_{AB}$.

Physically, this means that the global state space $AB$ has some degrees of freedom (collected in $C$) that cannot be
accessed locally at $A$ or $B$, not even by comparing correlations of measurement outcomes. It follows from the second
property that $u^{AB} = u^A \otimes u^B$, because $u^A \otimes u^B$ is a linear functional which gives unity on all global states. In this
notation, the tensor product of two linear functionals on $A$ and $B$ is assumed to act as the zero functional on $C$, i.e.

\[ L^A \otimes L^B \equiv L^A \otimes L^B \oplus 0^C. \]

The most famous example of a composite state space which is not locally tomographic is quantum theory over the
reals [55]:

Example 41 (Real quantum theory). Let $\Omega_A = \{\rho \in \mathbb{R}^{m \times m} \mid \mathrm{Tr}\,\rho = 1, \rho = \rho^T, \rho \geq 0\}$, that is, the set of $(m \times m)$-density
matrices with all real entries. The order unit is $u^A(\rho) = \mathrm{Tr}\,\rho$. This is a state space of dimension $K_A = m(m+1)/2$. Similarly,
let $B$ be the state space of $(n \times n)$-density matrices with all real entries. We assume $m, n \geq 2$.

Then, a composition of $A$ and $B$ is given by the set of all $(mn) \times (mn)$-density matrices with all real entries. Since $K_{AB} > K_A K_B$, this is not a locally tomographic composition, but it is easy to check that it satisfies all the properties of Definition 40.
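The inequality $K_{AB} > K_A K_B$ in Example 41 is a simple parameter count, which the following sketch makes explicit. The function names are hypothetical; the difference $K_{AB} - K_A K_B$ computed here is the dimension that the locally inaccessible subspace $C$ of Definition 40 must have in real quantum theory.

```python
def real_density_dim(n):
    """Number of real parameters of a real symmetric n x n matrix: n(n+1)/2."""
    return n * (n + 1) // 2

def inaccessible_dim(m, n):
    """dim C = K_AB - K_A * K_B for real quantum theory on R^m (x) R^n."""
    return real_density_dim(m * n) - real_density_dim(m) * real_density_dim(n)

for m, n in [(2, 2), (2, 3), (3, 3)]:
    k_a, k_b, k_ab = real_density_dim(m), real_density_dim(n), real_density_dim(m * n)
    print(m, n, k_a * k_b, k_ab, inaccessible_dim(m, n))
```

For two real qubits this gives $K_A K_B = 9$ and $K_{AB} = 10$, so one parameter of the global state cannot be obtained from local measurements and their correlations.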



Since $A$, $B$, and $AB$ are state spaces in the usual sense, the results of Subsections III A to III D apply without any
modification. As usual, if $AB$ is transitive, it has a decomposition $AB = (AB)^\wedge \oplus \mathbb{R} \cdot \mu^{AB}$. Moreover, we claim
that the locally inaccessible subspace $C$ is part of the Bloch subspace, $C \subseteq (AB)^\wedge$. To see this, let $c \in C$, then
$u^{AB}(c) = u^A \otimes u^B(c) = 0$. In more detail, we have the decomposition

\[ (AB)^\wedge = (\hat A \otimes \hat B) \oplus (\hat A \otimes \mu^B) \oplus (\mu^A \otimes \hat B) \oplus C, \]

which follows from the fact that the right-hand side is a subspace $V \subset AB$ of dimension $\dim V = \dim(AB) - 1$, and
$u^{AB} = u^A \otimes u^B$ evaluates to zero on all vectors of $V$. Now suppose that $AB$ is irreducible; then all the addends
above are mutually orthogonal in the invariant inner product on $(AB)^\wedge$. For example, to see that $\hat A \otimes \hat B \perp C$, let
$a \in \hat A$, $b \in \hat B$ and $c \in C$, and compute

\[ \langle a \otimes b, c \rangle = \langle (T_A \otimes \mathbb{1}_B \oplus \mathbb{1}_C)\, a \otimes b, (T_A \otimes \mathbb{1}_B \oplus \mathbb{1}_C)\, c \rangle = \langle T_A a \otimes b, c \rangle = \Big\langle \int T_A a \, dT_A \otimes b, c \Big\rangle = \langle 0, c \rangle = 0, \]

using the same argumentation as in Subsection III E. How is the maximally mixed state $\mu^{AB}$ on $AB$ related to $\mu^A$
and $\mu^B$? To answer this question, extend the inner product on $(AB)^\wedge$ to an inner product on all of $AB$: for $v, w \in AB$
with decomposition $v = \hat v + v_0 \mu^{AB}$ and $w = \hat w + w_0 \mu^{AB}$, where $v_0 = u^{AB}(v)$ and $w_0 = u^{AB}(w)$, we define

\[ \langle v, w \rangle := \langle \hat v, \hat w \rangle + v_0 w_0. \]

This inner product is clearly invariant with respect to all reversible transformations from $\mathcal{G}_{AB}$, and it is constructed
such that $\mu^{AB} \perp (AB)^\wedge$. Taking into account the orthogonality of subspaces mentioned above, this proves that

\[ C \oplus \mathbb{R} \cdot \mu^{AB} \perp (\hat A \otimes \hat B) \oplus (\hat A \otimes \mu^B) \oplus (\mu^A \otimes \hat B). \]

By integration as above, it is also easy to see that $\mu^A \otimes \mu^B$ is perpendicular to all the three subspaces $\hat A \otimes \hat B$, $\hat A \otimes \mu^B$,
and $\mu^A \otimes \hat B$. Thus, $\mu^A \otimes \mu^B \in C \oplus \mathbb{R} \cdot \mu^{AB}$. In other words, there is some constant $\zeta \in \mathbb{R}$ and vector $\mu^C \in C$ such that
$\mu^A \otimes \mu^B = \zeta \cdot \mu^{AB} - \mu^C$. Applying $u^{AB}$ to this equation, using that $u^{AB}(\mu^{AB}) = u^{AB}(\mu^A \otimes \mu^B) = 1$ and $u^{AB}(\mu^C) = 0$,
we get $\zeta = 1$ and thus

\[ \mu^{AB} = \mu^A \otimes \mu^B + \mu^C. \]

The Bloch vector $\mu^C$ can be interpreted as the collection of all locally inaccessible degrees of freedom of the maximally
mixed state on $AB$. For symmetry reasons, we think it is plausible that $\mu^C = 0$ for many theories, but we were unable
to prove this in generality.



Following the argumentation in Subsection III E, it is interesting to see that both Lemma 20 and Lemma 21 remain
valid if $AB$ is not locally tomographic, with only minor modifications. We now assume that $A$, $B$, and $AB$ are
transitive dynamical state spaces, where $A$ and $AB$ are irreducible. Lemma 20 becomes

\[ \langle x \otimes \mu^B, y \otimes \mu^B \rangle = \left( \mathcal{P}(\varphi^A \otimes \mu^B) - \|\mu^C\|_2^2 \right) \langle x, y \rangle, \]

while Lemma 21 gets modified to stating that

\[ \sqrt{\mathcal{P}(\varphi^A \otimes \mu^B) - \|\mu^C\|_2^2} \; X^A \otimes u^B \]

is a Pauli map on $AB$.

The only modification of Theorem 22 is that $K_A K_B$ has to be replaced by $K_{AB}$, the dimension of the composite
state space. The rest of the proof remains unaltered.

Theorem 42. Let $A$, $B$, and $AB$ be transitive dynamical state spaces, where $A$ and $AB$ are irreducible, and $AB$ is not
necessarily locally tomographic. Draw a state $\omega^{AB} \in \Omega_{AB}$ of fixed purity $\mathcal{P}(\omega^{AB})$ randomly. Then, the expected purity of the
local reduced state $\omega^A$ is

\[ \mathbb{E}_\omega \mathcal{P}(\omega^A) = \frac{K_A - 1}{K_{AB} - 1} \cdot \frac{\mathcal{P}(\omega^{AB})}{\mathcal{P}(\varphi^A \otimes \mu^B) - \|\mu^C\|_2^2}, \]

where $\varphi^A$ is an arbitrary pure state on $A$, $\mu^B$ is the maximally mixed state on $B$, and $\mu^C$ is the vector which contains the locally
inaccessible degrees of freedom of the maximally mixed state $\mu^{AB}$ on $AB$.

We leave it open whether the results of Subsection III F (including a more operational formulation of the main
result as in Theorem 29 involving only $N$ and $K$) can be generalized to composite state spaces that are not locally
tomographic: this seems to depend strongly on the question under what circumstances $\mu^{AB} = \mu^A \otimes \mu^B$ remains true
such that $\mu^C = 0$.



IV. SUMMARY AND OUTLOOK 

In summary, we considered general probabilistic theories and asked how mixed (impure) subsystems tend to be in
such theories after undergoing reversible dynamics. We showed that under certain limited assumptions subsystems
tend to be close to maximally mixed in appropriate limits, and the amount of purity is given by a simple formula. Showing
this involved developing various generalizations of the corresponding quantum concepts, e.g. purity, which are
of interest in themselves. Our results also apply to subspaces within quantum theory, and we calculated for example
the expected purity of subsystems in symmetric and antisymmetric spaces.

We view our results as a significant first step towards formulating the second law as a 'meta-theorem', meaning a 
theorem that requires weaker assumptions than for example all of quantum theory. More generally we envisage a 
formulation of statistical mechanics independent of theory details. Such a formulation can be expected to be useful 
for example for black hole thermodynamics, where one cannot be certain that standard quantum theory applies, but 
may accept some more basic assumptions. 

Appendix A: Irreducibility of the Clifford group 

Here we prove a lemma which is used in the main text in Example 16. It shows that our generalized definition
of a Pauli map reduces to the usual Pauli operators for the case of several qubits in quantum theory. It exploits the
well-known fact that the Clifford group is a 2-design [56].

Lemma 43. The Clifford group $C_k$ on $k$ qubits acts irreducibly by conjugation on the real vector space of traceless Hermitian
matrices on $(\mathbb{C}^2)^{\otimes k}$.

Proof. We use the notation from Example 16. If there is a real subspace $S \subseteq \hat A$ which is invariant with respect to all
Clifford maps, i.e. $U S U^\dagger \subseteq S$ for all $U \in C_k$, then its complexification

\[ S' := S + iS := \{ s_1 + i s_2 \mid s_1, s_2 \in S \} \subseteq \mathcal{B}(\mathcal{H}) \]

is a complex subspace of the set of all complex matrices $\mathcal{B}(\mathcal{H})$ on the Hilbert space $\mathcal{H} = (\mathbb{C}^2)^{\otimes k}$ which is also
invariant with respect to all Clifford maps. Fix any orthonormal basis $\{|i\rangle\}_{i=1}^{2^k}$ on $\mathcal{H}$, and define a complex-linear
map $\Phi : \mathcal{B}(\mathcal{H}) \to \mathcal{H} \otimes \mathcal{H}$ (which is related to the infamous Choi-Jamiołkowski isomorphism) by

\[ \Phi(M) := \sum_{i,j=1}^{2^k} \langle i | M | j \rangle \, |i\rangle \otimes |j\rangle. \]

It is a linear isomorphism which satisfies $\langle \Phi(M), \Phi(N) \rangle = \mathrm{Tr}(M^\dagger N)$. Moreover, we have $\Phi(U M U^\dagger) = (U \otimes \bar U)\Phi(M)$
for all unitaries $U$, where $\bar U$ denotes the complex-conjugate of $U$ with respect to the given basis. It follows that
the complex subspaces of $\mathcal{B}(\mathcal{H})$ which are invariant under conjugation with respect to all unitaries $U \in C_k$ are in
one-to-one correspondence with the complex subspaces of $\mathcal{H} \otimes \mathcal{H}$ which are invariant under $U \otimes \bar U$ for all unitaries
$U \in C_k$.
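The two properties of $\Phi$ used above are easy to confirm numerically. In the sketch below, $\Phi$ is implemented as row-major flattening of the matrix, which matches $\Phi(M) = \sum_{ij} \langle i|M|j\rangle\,|i\rangle \otimes |j\rangle$ when $|i\rangle \otimes |j\rangle$ is identified with the basis vector of index $i \cdot d + j$. The unitary is simply some unitary from a QR decomposition (the identities hold for all unitaries, not only Clifford ones); the names and the dimension `d = 4` are assumptions of this example.

```python
import numpy as np

def phi(M):
    """Phi(M) = sum_ij <i|M|j> |i> (x) |j>, i.e. row-major flattening of M."""
    return M.reshape(-1)

rng = np.random.default_rng(3)
d = 4  # two qubits

def random_matrix(dim):
    return rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))

U, _ = np.linalg.qr(random_matrix(d))    # some unitary, not necessarily Clifford
M, N = random_matrix(d), random_matrix(d)

# <Phi(M), Phi(N)> = Tr(M^dagger N); np.vdot conjugates its first argument.
inner_ok = np.isclose(np.vdot(phi(M), phi(N)), np.trace(M.conj().T @ N))

# Phi(U M U^dagger) = (U (x) conj(U)) Phi(M).
covariance_ok = np.allclose(phi(U @ M @ U.conj().T), np.kron(U, U.conj()) @ phi(M))

# Phi(identity) is proportional to the maximally entangled state and is invariant.
identity_image = phi(np.eye(d))
invariant_ok = np.allclose(np.kron(U, U.conj()) @ identity_image, identity_image)
print(inner_ok, covariance_ok, invariant_ok)
```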




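An obvious invariant subspace in $\mathcal{B}(\mathcal{H})$ consists of all complex multiples of the identity $\mathbb{1}$. The image is $\Phi(\mathbb{1}) = \sqrt{2^k}\,|\psi_m\rangle = \sum_i |i\rangle \otimes |i\rangle$, which reproduces the well-known fact that multiples of the maximally entangled state $|\psi_m\rangle$
are invariant with respect to transformations of the form $U \otimes \bar U$. In order to prove the lemma, we have to show that
this subspace and its orthogonal complement (consisting of the traceless matrices in $\mathcal{B}(\mathcal{H})$, respectively of the vectors
that are orthogonal to $|\psi_m\rangle$) are the only non-trivial subspaces which are $C_k$-invariant.

It is well-known [56] that the Clifford group is a 2-design, i.e.

\[ \frac{1}{|C_k|} \sum_{U \in C_k} (U \otimes U) M (U^\dagger \otimes U^\dagger) = \int_{U \in U(2^k)} (U \otimes U) M (U^\dagger \otimes U^\dagger) \, dU = \frac{\mathrm{Tr}(M \pi_s)}{\mathrm{Tr}\,\pi_s} \pi_s + \frac{\mathrm{Tr}(M \pi_a)}{\mathrm{Tr}\,\pi_a} \pi_a \qquad \text{(A1)} \]

for all $M \in \mathcal{B}(\mathcal{H} \otimes \mathcal{H})$, where $\pi_s$ and $\pi_a$ denote the projectors onto the symmetric and antisymmetric subspaces of $\mathcal{H} \otimes \mathcal{H}$,
respectively. We can write $\pi_s = (\mathbb{1} + \mathbb{F})/2$ and $\pi_a = (\mathbb{1} - \mathbb{F})/2$, where $\mathbb{F}|i\rangle \otimes |j\rangle = |j\rangle \otimes |i\rangle$ is the swap operator. It
is easy to see that $2^k \langle \psi_m | M^{T_B} | \psi_m \rangle = \mathrm{Tr}(M \mathbb{F})$, and $\mathbb{F}^{T_B} = 2^k |\psi_m\rangle\langle\psi_m|$, if $T_B$ denotes the partial transposition on the
second system. Moreover, it holds $\big( (A \otimes B) \rho (C \otimes D) \big)^{T_B} = (A \otimes D^T) \rho^{T_B} (C \otimes B^T)$ [57]. Using these identities and applying
$T_B$ to eq. (A1), we get

\[ \frac{1}{|C_k|} \sum_{U \in C_k} (U \otimes \bar U) M (U \otimes \bar U)^\dagger = \mathrm{Tr}(M \pi_m)\, \pi_m + \frac{\mathrm{Tr}(M \pi_m^\perp)}{\mathrm{Tr}\,\pi_m^\perp}\, \pi_m^\perp, \]

where $\pi_m := |\psi_m\rangle\langle\psi_m|$ and $\pi_m^\perp := \mathbb{1} - \pi_m$. By Schur's Lemma, it follows that the one-dimensional subspace spanned
by $|\psi_m\rangle$ and its orthogonal complement are the only non-trivial subspaces which are invariant with respect to $U \otimes \bar U$
for all $U \in C_k$. □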



Appendix B: Purity in boxworld 

As we show here, it is possible to define a notion of purity for generalized no-signalling theory [24], colloquially
called boxworld, even though this theory is not transitive [44]. However, it will turn out that the resulting notion of
purity does not have all the nice properties that hold in the transitive case.

To keep things simple, we will only consider the paradigmatic case of two observers (Alice and Bob), each carrying
a square state space (a so-called "gbit", as shown in, and discussed around, Figure 2). Operationally, this means
that both Alice and Bob carry two measurement devices with two outcomes each ("yes" and "no"); local states are
characterized by the two probabilities of the "yes"-outcomes. Both probabilities can be chosen independently, giving
rise to two coordinates in a square state space.

The two local state spaces are equal: $A = B$ (we use the two different labels for convenience). Now we use a
particular representation of the square state space introduced in [44]. We define the set of normalized states $\Omega_A$ as
the convex hull of the four pure states

\[ \omega_{\pm\pm} := \begin{pmatrix} 1 \\ \pm 1/\sqrt{2} \\ \pm 1/\sqrt{2} \end{pmatrix}. \]

As usual, the cone of unnormalized states is $A_+ := \mathbb{R}_0^+ \cdot \Omega_A$, and the order unit is $u^A = (1, 0, 0)^T$, if we denote effects
by vectors (such that $u^A(\omega) = \langle u^A, \omega \rangle$ in the usual inner product). It turns out that the cone of effects $A_+^*$ is generated
by the four effects

\[ Y = \begin{pmatrix} 1/2 \\ 1/\sqrt{2} \\ 0 \end{pmatrix}, \quad u^A - Y = \begin{pmatrix} 1/2 \\ -1/\sqrt{2} \\ 0 \end{pmatrix}, \quad Z = \begin{pmatrix} 1/2 \\ 0 \\ 1/\sqrt{2} \end{pmatrix}, \quad u^A - Z = \begin{pmatrix} 1/2 \\ 0 \\ -1/\sqrt{2} \end{pmatrix}. \]

The square state space is transitive. As discussed in Example 11, the group of reversible transformations $\mathcal{G}_A$ is the
dihedral group $D_4$. In the particular representation chosen here, it acts on the $y$- and $z$-components of a state
vector $\omega$, and leaves the $x$-component (the normalization) invariant. The maximally mixed state is $\mu^A = (1, 0, 0)^T$. The
set of unnormalized states on $AB$ is defined as follows:

\[ (AB)_+ := \{ \omega \in A \otimes B \mid E^A \otimes F^B(\omega) \geq 0 \text{ for all } E^A \in A_+^*, F^B \in B_+^* \}. \]

That is, these are all the vectors with the property that all local measurements yield non-negative outcome probabilities.
The bipartite (normalized) state space $\Omega_{AB}$ consists of all $\omega \in (AB)_+$ with $u^{AB}(\omega) = 1$, where the order unit is, as
always, $u^{AB} = u^A \otimes u^B$. Since $AB$ is 9-dimensional, $\Omega_{AB}$ is an 8-dimensional polytope, known as the no-signalling
polytope.

What are the pure states in $AB$? Clearly, the 16 product states $\omega_{\pm\pm} \otimes \omega_{\pm\pm}$ are pure. But there are 8 additional
entangled pure states: one of them is the famous PR box state $\omega_{PR}$, and the others can be obtained by local transformations
from $\omega_{PR}$. We could use vectors with 9 entries to write down those states explicitly, but it will be more
convenient to use another representation: given the three unit vectors $e_1, e_2, e_3$, we will denote states (and vectors)
$\omega \in AB$ as matrices $(\omega_{ij})$, by using the decomposition $\omega = \sum_{i,j=1}^3 \omega_{ij}\, e_i \otimes e_j$. In this representation, the maximally
mixed state and the pure product states are

\[ \mu^{AB} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \qquad \omega_{rs} \otimes \omega_{tu} = \begin{pmatrix} 1 & t/\sqrt{2} & u/\sqrt{2} \\ r/\sqrt{2} & rt/2 & ru/2 \\ s/\sqrt{2} & st/2 & su/2 \end{pmatrix} \qquad (r, s, t, u \in \{-1, +1\}). \]

One of the PR-box states is

\[ \omega_{PR} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1/2 & 1/2 \\ 0 & 1/2 & -1/2 \end{pmatrix}. \]

What are the reversible transformations in $AB$? It can be shown [44] that these are exactly the local transformations,
that is, those of the form $G_A \otimes G_B$, together with the swap transformation $S$ which exchanges the two subsystems.
There are no other reversible transformations in $\mathcal{G}_{AB}$. The bipartite space $AB$ decomposes into $\mathcal{G}_{AB}$-invariant
subspaces (the first addend cannot be decomposed further because $D_4$ acts complex-irreducibly on $\hat A$):

\[ AB = (\hat A \otimes \hat B) \oplus (\mu^A \otimes \hat B \oplus \hat A \otimes \mu^B) \oplus (\mathbb{R} \cdot \mu^A \otimes \mu^B), \qquad \text{(B1)} \]

where the three addends have dimensions 4, 4, and 1, respectively.

In this notation, $\hat A$ denotes the subspace of vectors $x \in A$ with $u^A(x) = 0$ (we called this the "Bloch subspace" in
Subsection III A). This shows that the only state on $AB$ which is invariant with respect to all reversible transformations
is the maximally mixed state $\mu^{AB} := \mu^A \otimes \mu^B$. We call the subspace generated by the first two addends above
$(AB)^\wedge$, such that

\[ AB = (AB)^\wedge \oplus \mathbb{R} \cdot \mu^{AB}. \]

In other words, $(AB)^\wedge$ consists of all vectors $x \in AB$ with $u^{AB}(x) = 0$.



Now we proceed as in Section III: to every state $\omega \in \Omega_{AB}$, we define the corresponding Bloch vector $\hat\omega$ as $\hat\omega := \omega - \mu^{AB}$. This vector is obtained from the matrix representation above by replacing the "1" in the upper-left corner
by a zero. Denote the usual Euclidean inner product by $(\cdot, \cdot)$. Then we define the purity of $\omega$ as

\[ \mathcal{P}(\omega) := c \cdot (\hat\omega, \hat\omega), \qquad \text{(B2)} \]

and we choose the constant $c > 0$ such that, say, the pure product states have purity $\mathcal{P}(\omega_{\pm\pm} \otimes \omega_{\pm\pm}) = 1$. Using the
representation above, it is easy to see that we must have $c = 1/3$. The resulting definition satisfies some of the
properties mentioned in Lemma 8:

• $0 \leq \mathcal{P}(\omega) \leq 1$ for all $\omega \in \Omega_{AB}$,

• $\mathcal{P}(\omega) = 0$ if and only if $\omega = \mu^{AB}$, i.e. if $\omega$ is the maximally mixed state on $AB$,

• $\sqrt{\mathcal{P}}$ is convex, and

• $\mathcal{P}(T\omega) = \mathcal{P}(\omega)$ for all reversible transformations $T \in \mathcal{G}_{AB}$ and states $\omega \in \Omega_{AB}$.

For example, to prove the last point, note that the local transformations on $A$ and $B$ are orthogonal in the chosen
representation: they rotate and reflect the square. Hence their product is orthogonal as well, and so is the swap. It
follows that $\mathcal{P}(T\omega) = c\,(T\hat\omega, T\hat\omega) = c\,(\hat\omega, T^T T \hat\omega) = c\,(\hat\omega, \hat\omega) = \mathcal{P}(\omega)$.

However, there is some bad news: if we compute the purity of the pure PR box state, we get

\[ \mathcal{P}(\omega_{PR}) = \frac{1}{3} (\hat\omega_{PR}, \hat\omega_{PR}) = \frac{1}{3}. \]

Even though this state is pure, it has purity (much) less than one. On transitive state spaces as considered in Section III,
this cannot happen: all pure states have purity 1. Vice versa, we can see from this result that there is no
reversible transformation which maps a pure product state to a PR-box state: if there was one, then both states
necessarily would have the same purity.
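The numbers in this appendix can be reproduced with a few lines of code. The sketch below assumes the matrix representation described above, with pure gbit states $(1, \pm 1/\sqrt{2}, \pm 1/\sqrt{2})^T$ and local effects $Y = (1/2, 1/\sqrt{2}, 0)^T$, $Z = (1/2, 0, 1/\sqrt{2})^T$ together with their complements (these explicit vectors are taken as an assumption for this example). It computes the purity with $c = 1/3$ for all 16 pure product states and for the PR box, and checks that the PR box assigns valid probabilities to all pairs of local effects.

```python
import numpy as np
from itertools import product

SQRT2 = np.sqrt(2.0)

def local_state(r, s):
    """Pure gbit state (1, r/sqrt(2), s/sqrt(2)) with r, s in {-1, +1}."""
    return np.array([1.0, r / SQRT2, s / SQRT2])

MU_AB = np.zeros((3, 3))
MU_AB[0, 0] = 1.0                      # maximally mixed state mu^A (x) mu^B
OMEGA_PR = np.array([[1.0, 0.0, 0.0],
                     [0.0, 0.5, 0.5],
                     [0.0, 0.5, -0.5]])

def purity(omega, c=1.0 / 3.0):
    """Eq. (B2): c times the Euclidean norm squared of the Bloch vector."""
    bloch = omega - MU_AB
    return c * np.sum(bloch * bloch)

product_purities = [purity(np.outer(local_state(r, s), local_state(t, u)))
                    for r, s, t, u in product([-1, 1], repeat=4)]
pr_purity = purity(OMEGA_PR)

# Assumed local effects: Y, Z and their complements with respect to u^A = (1, 0, 0).
U_A = np.array([1.0, 0.0, 0.0])
Y = np.array([0.5, 1 / SQRT2, 0.0])
Z = np.array([0.5, 0.0, 1 / SQRT2])
EFFECTS = [Y, U_A - Y, Z, U_A - Z]
pr_probabilities = [e @ OMEGA_PR @ f for e in EFFECTS for f in EFFECTS]

print(min(product_purities), max(product_purities), pr_purity)
```

With these conventions every joint probability of the PR box is either 0 or 1/2, which is the familiar perfectly (anti)correlated behaviour.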



Can we somehow avoid this problem? So far, we have been a bit hasty in our definition: in eq. (B2), we defined
purity with respect to the usual Euclidean inner product, because all reversible transformations in $\mathcal{G}_{AB}$ are orthogonal
with respect to this inner product. However, the decomposition (B1) shows that this is not the only inner product
on $(AB)^\wedge$ (where the Bloch vectors $\hat\omega$ live) which has this property: if we have two vectors $\hat\varphi, \hat\omega$ in this space, we can
decompose them as

\[ \hat\varphi = \varphi' + \varphi'', \qquad \varphi' \in \hat A \otimes \hat B, \qquad \varphi'' \in (\mu^A \otimes \hat B) \oplus (\hat A \otimes \mu^B), \]

and similarly for $\hat\omega$, and then define

\[ \langle \hat\varphi, \hat\omega \rangle := a\,(\varphi', \omega') + b\,(\varphi'', \omega''), \]

where all brackets on the right-hand side denote the usual Euclidean inner product. For every choice of $a, b > 0$, this
yields an invariant inner product on $(AB)^\wedge$. Is there a way to choose $a$ and $b$ such that the purity of pure product
states and PR-box states equals unity at the same time? (We can retain $c = 1/3$ in definition (B2) and absorb any
necessary factor into $a$ and $b$.) Using that $\mu^A = e_1$ and $\hat A = \mathrm{span}\{e_2, e_3\}$, it is easy to see that

\[ \mathcal{P}(\omega_{\pm\pm} \otimes \omega_{\pm\pm}) = \frac{1}{3}(a + 2b), \qquad \mathcal{P}(\omega_{PR}) = \frac{1}{3} a. \]

Both expressions can only be simultaneously equal to 1 if $b = 0$. But this ruins the inner-product property. If we
ignore this, and go on with setting $a = 3$ and $b = 0$, we lose the property that $\mathcal{P}(\omega) = 0$ only for the maximally
mixed state $\omega = \mu^{AB}$: for example, we get $\mathcal{P}(\mu^A \otimes \omega_{\pm\pm}) = 0$.
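The effect of the two weights can be made concrete as follows. In the matrix representation, the component in $\hat A \otimes \hat B$ consists of the lower-right $2 \times 2$ block, while the component in $(\mu^A \otimes \hat B) \oplus (\hat A \otimes \mu^B)$ consists of the first row and first column without the corner. The sketch below is a minimal illustration under these conventions; the function name `weighted_purity` is hypothetical.

```python
import numpy as np

SQRT2 = np.sqrt(2.0)

def weighted_purity(omega, a, b, c=1.0 / 3.0):
    """c * (a * |omega'|^2 + b * |omega''|^2) for a 3x3 state matrix omega.

    omega'  : entries with both indices >= 1 (component in the product of Bloch subspaces)
    omega'' : first row and first column without the corner (local marginal parts)
    """
    correlation_part = np.sum(omega[1:, 1:] ** 2)
    marginal_part = np.sum(omega[0, 1:] ** 2) + np.sum(omega[1:, 0] ** 2)
    return c * (a * correlation_part + b * marginal_part)

local = np.array([1.0, 1 / SQRT2, 1 / SQRT2])      # one of the pure gbit states
mu_a = np.array([1.0, 0.0, 0.0])                   # maximally mixed gbit state
product_state = np.outer(local, local)
half_mixed = np.outer(mu_a, local)                 # mu^A (x) omega
omega_pr = np.array([[1.0, 0.0, 0.0],
                     [0.0, 0.5, 0.5],
                     [0.0, 0.5, -0.5]])

for a, b in [(1.0, 1.0), (3.0, 0.0)]:
    print(a, b,
          weighted_purity(product_state, a, b),
          weighted_purity(omega_pr, a, b),
          weighted_purity(half_mixed, a, b))
```

With $a = b = 1$ the Euclidean definition (B2) is recovered; with $a = 3$, $b = 0$ both pure states reach purity 1, but the non-maximally-mixed state $\mu^A \otimes \omega_{\pm\pm}$ is assigned purity 0.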

In summary, there is no definition of purity on bipartite boxworld which has all the nice properties that hold true
in transitive state spaces. However, if we accept the existence of pure states with purity less than one, then eq. (B2)
can be a useful definition. A similar conclusion holds for other non-transitive state spaces.